python - Numpy __getitem__ delayed evaluation and a[-1:] not the same as a[slice(-1, None, None)]

Tags: python numpy magic-methods

This is about two questions that I assume stem from the same basic confusion on my part. I hope that's okay.

Here is some code:

import numpy as np

class new_array(np.ndarray):

    def __new__(cls, array, foo):
        obj = array.view(cls)
        obj.foo = foo
        return obj

    def __array_finalize__(self, obj):
        print "__array_finalize"
        if obj is None: return
        self.foo = getattr(obj, 'foo', None)

    def __getitem__(self, key):
        print "__getitem__"
        print "key is %s"%repr(key)
        print "self.foo is %d, self.view(np.ndarray) is %s"%(
            self.foo,
            repr(self.view(np.ndarray))
            )
        self.foo += 1
        return super(new_array, self).__getitem__(key)

print "Block 1"
print "Object construction calls"
base_array = np.arange(20).reshape(4,5)
print "base_array is %s"%repr(base_array)
p = new_array(base_array, 0)
print "\n\n"

print "Block 2"
print "Call sequence for p[-1:] is:"
p[-1:]
print "p[-1].foo is %d\n\n"%p.foo

print "Block 3"
print "Call sequence for s = p[-1:] is:"
s = p[-1:]
print "p[-1].foo is now %d"%p.foo
print "s.foo is now %d"%s.foo
print "s.foo + p.foo = %d\n\n"%(s.foo + p.foo)

print "Block 4"
print "Doing q = s + s"
q = s + s
print "q.foo = %d\n\n"%q.foo

print "Block 5"
print "Printing s"
print repr(s)
print "p.foo is now %d"%p.foo
print "s.foo is now %d\n\n"%s.foo

print "Block 6"
print "Printing q"
print repr(q)
print "p.foo is now %d"%p.foo
print "s.foo is now %d"%s.foo
print "q.foo is now %d\n\n"%q.foo

print "Block 7"
print "Call sequence for p[-1]"
a = p[-1]
print "p[-1].foo is %d\n\n"%a.foo

print "Block 8"
print "Call sequence for p[slice(-1, None, None)] is:"
a = p[slice(-1, None, None)]
print "p[slice(None, -1, None)].foo is %d"%a.foo
print "p.foo is %d"%p.foo
print "s.foo + p.foo = %d\n\n"%(s.foo + p.foo)

The output of this code is:
Block 1
Object construction calls
base_array is array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])
__array_finalize



Block 2
Call sequence for p[-1:] is:
__array_finalize
p[-1].foo is 0


Block 3
Call sequence for s = p[-1:] is:
__array_finalize
p[-1].foo is now 0
s.foo is now 0
s.foo + p.foo = 0


Block 4
Doing q = s + s
__array_finalize
q.foo = 0


Block 5
Printing s
__getitem__
key is -1
self.foo is 0, self.view(np.ndarray) is array([[15, 16, 17, 18, 19]])
__array_finalize
__getitem__
key is -5
self.foo is 1, self.view(np.ndarray) is array([15, 16, 17, 18, 19])
__getitem__
key is -4
self.foo is 2, self.view(np.ndarray) is array([15, 16, 17, 18, 19])
__getitem__
key is -3
self.foo is 3, self.view(np.ndarray) is array([15, 16, 17, 18, 19])
__getitem__
key is -2
self.foo is 4, self.view(np.ndarray) is array([15, 16, 17, 18, 19])
__getitem__
key is -1
self.foo is 5, self.view(np.ndarray) is array([15, 16, 17, 18, 19])
new_array([[15, 16, 17, 18, 19]])
p.foo is now 0
s.foo is now 1


Block 6
Printing q
__getitem__
key is -1
self.foo is 0, self.view(np.ndarray) is array([[30, 32, 34, 36, 38]])
__array_finalize
__getitem__
key is -5
self.foo is 1, self.view(np.ndarray) is array([30, 32, 34, 36, 38])
__getitem__
key is -4
self.foo is 2, self.view(np.ndarray) is array([30, 32, 34, 36, 38])
__getitem__
key is -3
self.foo is 3, self.view(np.ndarray) is array([30, 32, 34, 36, 38])
__getitem__
key is -2
self.foo is 4, self.view(np.ndarray) is array([30, 32, 34, 36, 38])
__getitem__
key is -1
self.foo is 5, self.view(np.ndarray) is array([30, 32, 34, 36, 38])
new_array([[30, 32, 34, 36, 38]])
p.foo is now 0
s.foo is now 1
q.foo is now 1


Block 7
Call sequence for p[-1]
__getitem__
key is -1
self.foo is 0, self.view(np.ndarray) is array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])
__array_finalize
p[-1].foo is 1


Block 8
Call sequence for p[slice(-1, None, None)] is:
__getitem__
key is slice(-1, None, None)
self.foo is 1, self.view(np.ndarray) is array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])
__array_finalize
p[slice(None, -1, None)].foo is 2
p.foo is 2
s.foo + p.foo = 3

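Note two things:
  • Calling p[-1:] does not result in a call to new_array.__getitem__. The same holds if p[-1:] is replaced by things like p[0:], p[0:-1], and so on, but statements like p[-1] and p[slice(-1, None, None)] do result in a call to new_array.__getitem__. It also holds for statements like p[-1:] + p[-1:] or s = p[-1], but not for statements like print s. You can see this by looking at the "blocks" given above.
  • The variable foo is correctly updated during calls to new_array.__getitem__ (see blocks 5 and 6), but its value is not correct once evaluation of new_array.__getitem__ has finished (again, see blocks 5 and 6). I should also add that replacing the line return super(new_array, self).__getitem__(key) with return new_array(np.array(self.view(np.ndarray)[key]), self.foo) does not work either. The following blocks are the only differences in the output.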
    Block 5
    Printing s
    __getitem__
    key is -1
    self.foo is 0, self.view(np.ndarray) is array([[15, 16, 17, 18, 19]])
    __array_finalize__
    __getitem__
    key is -5
    self.foo is 1, self.view(np.ndarray) is array([15, 16, 17, 18, 19])
    __array_finalize__
    __array_finalize__
    __array_finalize__
    __getitem__
    key is -4
    self.foo is 2, self.view(np.ndarray) is array([15, 16, 17, 18, 19])
    __array_finalize__
    __array_finalize__
    __array_finalize__
    __getitem__
    key is -3
    self.foo is 3, self.view(np.ndarray) is array([15, 16, 17, 18, 19])
    __array_finalize__
    __array_finalize__
    __array_finalize__
    __getitem__
    key is -2
    self.foo is 4, self.view(np.ndarray) is array([15, 16, 17, 18, 19])
    __array_finalize__
    __array_finalize__
    __array_finalize__
    __getitem__
    key is -1
    self.foo is 5, self.view(np.ndarray) is array([15, 16, 17, 18, 19])
    __array_finalize__
    __array_finalize__
    __array_finalize__
    new_array([[15, 16, 17, 18, 19]])
    p.foo is now 0
    s.foo is now 1
    
    
    Block 6
    Printing q
    __getitem__
    key is -1
    self.foo is 0, self.view(np.ndarray) is array([[30, 32, 34, 36, 38]])
    __array_finalize__
    __getitem__
    key is -5
    self.foo is 1, self.view(np.ndarray) is array([30, 32, 34, 36, 38])
    __array_finalize__
    __array_finalize__
    __array_finalize__
    __getitem__
    key is -4
    self.foo is 2, self.view(np.ndarray) is array([30, 32, 34, 36, 38])
    __array_finalize__
    __array_finalize__
    __array_finalize__
    __getitem__
    key is -3
    self.foo is 3, self.view(np.ndarray) is array([30, 32, 34, 36, 38])
    __array_finalize__
    __array_finalize__
    __array_finalize__
    __getitem__
    key is -2
    self.foo is 4, self.view(np.ndarray) is array([30, 32, 34, 36, 38])
    __array_finalize__
    __array_finalize__
    __array_finalize__
    __getitem__
    key is -1
    self.foo is 5, self.view(np.ndarray) is array([30, 32, 34, 36, 38])
    __array_finalize__
    __array_finalize__
    __array_finalize__
    new_array([[30, 32, 34, 36, 38]])
    p.foo is now 0
    s.foo is now 1
    q.foo is now 1
    

    The output now contains an excessive number of calls to new_array.__array_finalize__, but the "problem" with the variable foo is unchanged.
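    I expected that a call like p[-1:] on a new_array object with p.foo = 0 would result in the statement p.foo == 1 returning True. Clearly that is not the case, and it would not be even if foo were correctly updated during calls to __getitem__, because a statement like p[-1:] results in a large number of calls to __getitem__ (once the delayed evaluation is taken into account). Moreover, the calls p[-1:] and p[slice(-1, None, None)] would result in different values of foo (if the counting worked correctly). In the former case foo would have had 5 added to it, while in the latter case foo would have had 1 added to it.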

    The question

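    While the delayed evaluation of slices of numpy arrays does not cause problems when my code runs, it has been a huge pain when debugging some of my code with pdb. Basically, statements evaluate differently at run time and in pdb. I don't think that is good. That is how I stumbled across this behavior.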

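    My code uses the input to __getitem__ to decide what type of object should be returned. In some cases it returns a new instance of the same type, in other cases a new instance of some other type, and in still other cases a numpy array, scalar, or float (depending on what the underlying numpy array thinks is right). I use the key passed to __getitem__ to determine the correct object to return. But I cannot do this if the user has passed a slice, e.g. something like p[-1:], because the method only receives individual indices, as if the user had written p[4]. So how do I do this if the key in the __getitem__ of my numpy subclass does not reflect whether the user is requesting a slice, given by p[-1:], or just a single entry, given by p[4]?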
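The kind of dispatch described above can be sketched as follows. This is only an illustration under one loud assumption: it targets Python 3, where slice syntax such as p[-1:] reaches __getitem__ as a slice object, whereas the post itself uses Python 2. The class name KeyAwareArray and the attribute last_kind are hypothetical and do not appear in the original code.

```python
import numpy as np


class KeyAwareArray(np.ndarray):
    """Hypothetical ndarray subclass that records what kind of key it received."""

    def __new__(cls, array):
        obj = np.asarray(array).view(cls)
        obj.last_kind = None
        return obj

    def __array_finalize__(self, obj):
        # Every newly created view starts without a recorded key kind.
        self.last_kind = None

    def __getitem__(self, key):
        # Classify the key before delegating to ndarray's own indexing.
        if isinstance(key, slice):
            self.last_kind = "slice"
        elif isinstance(key, tuple):
            self.last_kind = "tuple"
        elif isinstance(key, (int, np.integer)):
            self.last_kind = "integer"
        else:
            self.last_kind = "other"
        return super(KeyAwareArray, self).__getitem__(key)


p = KeyAwareArray(np.arange(20).reshape(4, 5))

p[-1:]
kind_for_slice = p.last_kind      # "slice" on Python 3

p[3]
kind_for_integer = p.last_kind    # "integer"

p[1:3, 0]
kind_for_tuple = p.last_kind      # "tuple"
```

A real implementation would return different object types in each branch instead of only recording a label; the label is used here just to make the classification observable.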

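    As a side note, the numpy indexing documentation implies that slice objects, e.g. slice(start, stop, step), will be treated the same as statements like start:stop:step. This makes me think I am missing something very basic. The sentence that implies this appears quite early: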

    Basic slicing occurs when obj is a slice object (constructed by start:stop:step notation inside of brackets), an integer, or a tuple of slice objects and integers.



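    I can't help but feel that this same basic mistake is also why I thought the line self.foo += 1 should count the number of times a user requests a slice or an element of a new_array instance (rather than the number of elements "in" a slice). Are these two issues actually related, and if so, how?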
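One part of the foo behavior can be reproduced in isolation. The sketch below assumes the same __new__ / __array_finalize__ pattern as the code in the question and uses a hypothetical class name, Tagged. It shows that a slice is a new object whose foo is copied from its parent when the view is created, so a later increment on the slice never reaches the parent.

```python
import numpy as np


class Tagged(np.ndarray):
    """Hypothetical subclass carrying one extra attribute, foo."""

    def __new__(cls, array, foo):
        obj = np.asarray(array).view(cls)
        obj.foo = foo
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        # Copy the parent's current value. Integers are immutable, so a later
        # change on one object is never visible on the other.
        self.foo = getattr(obj, "foo", None)


p = Tagged(np.arange(20).reshape(4, 5), 0)
s = p[-1:]    # new view; s.foo is copied from p.foo at this moment
s.foo += 1    # rebinds s.foo only

print(p.foo, s.foo)
```

This matches the pattern in blocks 5 and 6, where s.foo and q.foo change while p.foo stays at 0: the __getitem__ calls triggered by printing run on s and q (and on the temporary rows taken from them), not on p.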

    Best answer

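    You have indeed been bitten by a nasty bug. It's a relief to know I'm not the only one! Fortunately, it is easy to fix. Just add something like the following to your class. It is actually a copy-paste from some code I wrote a few months ago; the docstring pretty much explains what is going on, but you may want to read the Python docs as well.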

    def __getslice__(self, start, stop) :
        """This solves a subtle bug, where __getitem__ is not called, and all
        the dimensional checking not done, when a slice of only the first
        dimension is taken, e.g. a[1:3]. From the Python docs:
           Deprecated since version 2.0: Support slice objects as parameters
           to the __getitem__() method. (However, built-in types in CPython
           currently still implement __getslice__(). Therefore, you have to
           override it in derived classes when implementing slicing.)
        """
        return self.__getitem__(slice(start, stop))
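As a rough check of the fix above, the following sketch combines the forwarding __getslice__ with a call counter. The class name CountingArray and the attribute calls are hypothetical. On Python 3, __getslice__ is never invoked and the method is simply dead code, so the example runs unchanged there; on Python 2 it forwards a[i:j] to __getitem__.

```python
import numpy as np


class CountingArray(np.ndarray):
    """Hypothetical subclass that counts explicit __getitem__ calls."""

    def __new__(cls, array):
        obj = np.asarray(array).view(cls)
        obj.calls = 0
        return obj

    def __array_finalize__(self, obj):
        # Every new view starts its own count.
        self.calls = 0

    def __getslice__(self, start, stop):
        # Reached only on Python 2 for a[i:j]; Python 3 never calls it.
        return self.__getitem__(slice(start, stop))

    def __getitem__(self, key):
        self.calls += 1
        return super(CountingArray, self).__getitem__(key)


p = CountingArray(np.arange(20).reshape(4, 5))
row_block = p[-1:]
same_block = p[slice(-1, None, None)]

print(p.calls)                                                      # 2
print(np.array_equal(np.asarray(row_block), np.asarray(same_block)))  # True
```

With the forwarding in place, both spellings of the slice go through __getitem__ exactly once each, which is the counting behavior the question expected from p[-1:].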
    

    Regarding python - Numpy __getitem__ delayed evaluation and a[-1:] not the same as a[slice(-1, None, None)], a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/14553485/
