python - Numpy object arrays

I recently ran into a problem when creating Numpy object arrays with, for example,

a = np.array([c], dtype=object)

where c is an instance of some complicated class; in some cases Numpy tries to access some of that class's methods. However, doing:

a = np.empty((1,), dtype=object)
a[0] = c

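fixes the problem. I'm curious what the internal difference between the two is. Why might Numpy try to access some attributes or methods of c in the first case?

EDIT: For the record, here is example code that demonstrates the issue: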

import numpy as np

class Thing(object):

    def __getitem__(self, item):
        print("in getitem")

    def __len__(self):
        return 1

a = np.array([Thing()], dtype='object')

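This prints "in getitem" twice. Basically, a class that defines __len__ is where this unexpected behavior can show up.

Best answer

In the first case, a = np.array([c], dtype=object), numpy knows nothing about the shape of the intended array.

For example, when you define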

d = range(10)
a = np.array([d])

you expect numpy to determine the shape from the length of d.
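
As a quick check of that shape inference, here is a minimal sketch. The names d and a follow the snippet above; list(range(10)) is an assumption made here so the result is the same on Python 2 and Python 3.

```python
import numpy as np

d = list(range(10))
a = np.array([d])   # the length of d sizes the second axis

print(a.shape)      # (1, 10)
```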

Similarly, in your case numpy will check whether len(c) is defined and, if it is, access the elements of c via c[i].

The effect can be seen by defining a class such as

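class X(object):
    def __len__(self): return 10
    def __getitem__(self, i):
        # Bound the index so that iterating over an instance terminates.
        if i >= len(self): raise IndexError(i)
        return "x" * i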

and then

print(np.array([X()], dtype=object))

which yields

[[ x xx xxx xxxx xxxxx xxxxxx xxxxxxx xxxxxxxx xxxxxxxxx]]
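
To see which methods get touched, one option is to log the calls. This is a sketch under assumptions: Recorder is a hypothetical class introduced here, not part of the original answer, and the exact sequence of recorded calls is likely to differ between numpy versions, so only the resulting shape is stated.

```python
import numpy as np

class Recorder(object):
    """Hypothetical sequence-like class that logs protocol calls."""

    def __init__(self, n):
        self.n = n
        self.calls = []

    def __len__(self):
        self.calls.append("__len__")
        return self.n

    def __getitem__(self, i):
        self.calls.append("__getitem__(%d)" % i)
        if not 0 <= i < self.n:
            raise IndexError(i)   # lets plain iteration terminate
        return "x" * i

r = Recorder(3)
a = np.array([r], dtype=object)

print(a.shape)   # (1, 3): the instance was unpacked into a second axis
print(r.calls)   # the recorded calls depend on the numpy version
```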

By contrast, in your second case

a = np.empty((1,), dtype=object)
a[0] = c

the shape of a is already determined, so numpy can assign the object directly.
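
The same pattern extends to several objects. The helper below is a minimal sketch: as_object_vector and Pair are hypothetical names introduced here for illustration, not part of numpy or the original answer.

```python
import numpy as np

def as_object_vector(items):
    """Hypothetical helper: store each item as one opaque array element."""
    out = np.empty((len(items),), dtype=object)
    for i, item in enumerate(items):
        out[i] = item   # integer index into a 1-D array stores the object itself
    return out

class Pair(object):
    """Hypothetical sequence-like class."""

    def __len__(self):
        return 2

    def __getitem__(self, i):
        if not 0 <= i < 2:
            raise IndexError(i)
        return i

items = [Pair(), Pair()]
vec = as_object_vector(items)

print(vec.shape)            # (2,)
print(vec[0] is items[0])   # True
```

Allocating the array first fixes the shape up front, so numpy has no reason to inspect the instances to infer extra dimensions.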

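However, this only holds because a is a vector and a[0] addresses a single element. If a had a different shape, the method access would still happen. The following example still calls __getitem__ on the class:

a = np.empty((1, 10), dtype=object)
a[0] = X()
print(a)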

[[ x xx xxx xxxx xxxxx xxxxxx xxxxxxx xxxxxxxx xxxxxxxxx]]
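
If the goal is to keep the instance intact inside a multi-dimensional object array, one option is to address a single cell rather than a whole row. This is an addition to the original answer, sketched with a bounds-checked copy of the X class so that it runs on its own.

```python
import numpy as np

class X(object):
    def __len__(self):
        return 10

    def __getitem__(self, i):
        if not 0 <= i < 10:
            raise IndexError(i)
        return "x" * i

a = np.empty((1, 10), dtype=object)
x = X()
a[0, 0] = x              # one cell addressed: the instance is stored whole

print(a[0, 0] is x)      # True
print(a[0, 1] is None)   # True: untouched cells keep the None fill
```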

Regarding python - Numpy object arrays, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/7667799/
