python - How do I set the dtype of a nested numpy ndarray?

I am working with the following data structure, from which I am trying to create a single ndarray containing all the data:

      instrument         filter             response
-----------------------------------------------------
       spire              250um           array of response
         ...               ...                ...

where the array of response is:
      linenumber      wavelength      throughput
-----------------------------------------------------
         0     1.894740e+06           0.000e+00
         1     2.000000e+06           1.000e-02
         2     2.026320e+06           3.799e-02
        ...              ....              ....

So I was hoping to convert the data into one ndarray with the following code:

import numpy as np

data = [('spire', '250um', [(0, 1.89e6, 0.0), (1,2e6, 1e-2), (2,2.02e6,3.8e-2), ...]),
        ('spire', '350', [ (...), (...), ...]),
        ...,
        ]
table = np.array(data, dtype=[('instrument', '|S32'),
                               ('filter', '|S64'),
                               ('response', [('linenumber', 'i'),
                                             ('wavelength', 'f'),
                                             ('throughput', 'f')])
                              ])

This code raises an exception because of the list(tuple, list(tuple)) pattern. If I change data to:

data = [('spire', '250um', np.array([(0, 1.89e6, 0.0), (1,2e6, 1e-2), (2,2.02e6,3.8e-2), ...],
                                    dtype=[('linenumber','i'), ('wavelength','f'), ('throughput','f')])),
        ('spire', '350', np.array([ (...), (...), ...],dtype=[...])),
        ...,
        ]

then the code runs, but the result is wrong: for the response field, only the first entry of the response array is taken:

>>> print(table[0])

('spire', '250um', (0,1.89e6,0.0))

instead of the whole array.

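My question is: how do I set the dtype keyword correctly to make this work, in both cases: 1. a nested list of tuples that contains lists of tuples; 2. a nested list of tuples that contains non-homogeneous ndarrays?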

Thank you in advance!

Best answer

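I was able to get this to work when the response arrays have a fixed length (perhaps NumPy has to be able to precompute the size of each record in a structured array?). As described in the NumPy manual page for structured arrays, you can specify a shape for a field in a structured array.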

import numpy as np

data = [('spire', '250um', [(0, 1.89e6, 0.0), (1, 2e6, 1e-2)]),
        ('spire', '350',   [(0, 1.89e6, 0.0), (2, 2.02e6, 3.8e-2)])
        ]
table = np.array(data, dtype=[('instrument', '|S32'),
                               ('filter', '|S64'),
                               ('response', [('linenumber', 'i'),
                                             ('wavelength', 'f'),
                                             ('throughput', 'f')], (2,))
                              ])

print(table[0])
# the original answer (Python 2) reported:
# ('spire', '250um', [(0, 1890000.0, 0.0), (1, 2000000.0, 0.009999999776482582)])
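
As a follow-up sketch (not part of the original answer), the snippet below shows how a fixed-shape field can be read back, and why the record size is fixed. It also shows one possible workaround for responses of differing lengths, which is to store each response array in an object (`'O'`) field. The names `response_dtype`, `fixed_dtype`, `ragged_dtype` and `ragged` are made up for illustration, and the explicit `'i4'`/`'f4'` codes are an assumption standing in for the `'i'`/`'f'` used above.

```python
import numpy as np

# dtype of one row of a response table
response_dtype = np.dtype([('linenumber', 'i4'),
                           ('wavelength', 'f4'),
                           ('throughput', 'f4')])

# Fixed-length case: every 'response' field holds exactly 2 records.
fixed_dtype = np.dtype([('instrument', 'S32'),
                        ('filter', 'S64'),
                        ('response', response_dtype, (2,))])

data = [('spire', '250um', [(0, 1.89e6, 0.0), (1, 2e6, 1e-2)]),
        ('spire', '350',   [(0, 1.89e6, 0.0), (2, 2.02e6, 3.8e-2)])]
table = np.array(data, dtype=fixed_dtype)

# The sub-array shape is appended to the shape of the outer array.
print(table['response'].shape)             # (2, 2)
# Wavelengths of the second filter's response.
print(table[1]['response']['wavelength'])
# Every record has the same size: 32 + 64 + 2 * 12 bytes.
print(fixed_dtype.itemsize)                # 120

# Variable-length case (an assumption, not from the answer): keep each
# response as a separate ndarray referenced from an object field.
ragged_dtype = np.dtype([('instrument', 'S32'),
                         ('filter', 'S64'),
                         ('response', 'O')])
ragged = np.empty(2, dtype=ragged_dtype)
ragged[0] = (b'spire', b'250um',
             np.array([(0, 1.89e6, 0.0)], dtype=response_dtype))
ragged[1] = (b'spire', b'350',
             np.array([(0, 1.89e6, 0.0), (1, 2e6, 1e-2), (2, 2.02e6, 3.8e-2)],
                      dtype=response_dtype))

# Each record now points at a response array of its own length.
print([len(r) for r in ragged['response']])  # [1, 3]
```

The object field trades away the contiguous, fixed-size layout: the response arrays live outside the table's buffer, so vectorized operations across all responses at once are no longer available.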

This is a similar question on "python - How do I set the dtype of a nested numpy ndarray?" found on Stack Overflow: https://stackoverflow.com/questions/19201868/
