python - How to monkey patch the Python list __setitem__ method

Tags: python ctypes introspection

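I'd like to monkey patch Python lists, specifically by replacing the __setitem__ method with custom code. Note that I am not trying to extend the built-in type but to override it. For example: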

>>> # Monkey Patch  
... # Replace list.__setitem__ with a Noop
...
>>> myList = [1,2,3,4,5]
>>> myList[0] = "Nope"
>>> myList
[1, 2, 3, 4, 5]

Yes, I know this is a downright perverted thing to do to Python code. No, my use case doesn't really make sense. Nonetheless, can it be done?

Possible avenues:

Example

I actually managed to override the method itself, as shown below:

import ctypes

def magic_get_dict(o):
    # find address of dict whose offset is stored in the type
    dict_addr = id(o) + type(o).__dictoffset__
    # retrieve the dict object itself
    dict_ptr = ctypes.cast(dict_addr, ctypes.POINTER(ctypes.py_object))
    return dict_ptr.contents.value

def magic_flush_mro_cache():
    ctypes.PyDLL(None).PyType_Modified(ctypes.cast(id(object), ctypes.py_object))

print(list.__setitem__)
dct = magic_get_dict(list)
dct['__setitem__'] = lambda s, k, v: s
magic_flush_mro_cache()
print(list.__setitem__)

x = [1,2,3,4,5]
print(x.__setitem__)
x.__setitem__(0,10)
x[1] = 20
print(x)

The output is as follows:

➤ python3 override.py
<slot wrapper '__setitem__' of 'list' objects>
<function <lambda> at 0x10de43f28>
<bound method <lambda> of [1, 2, 3, 4, 5]>
[1, 20, 3, 4, 5]

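But as the output shows, this doesn't seem to affect the normal item-assignment syntax (x[0] = 0).

Alternative: monkey patching an individual list instance

As a lesser alternative, being able to monkey patch an individual list instance would work too, perhaps by changing the list's class pointer to a custom class.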
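For comparison, here is a minimal sketch of what the pure-Python side of this alternative looks like. The class name `NoSetItemList` is hypothetical and not part of the original question. A subclass created up front can ignore item assignment, but CPython rejects swapping the class pointer of an existing plain list through `__class__`, which is why the question turns to ctypes.

```python
class NoSetItemList(list):
    """Hypothetical list subclass whose item assignment is a no-op."""

    def __setitem__(self, index, value):
        # Silently ignore the assignment instead of storing the value.
        pass


# A list built from the subclass ignores item assignment.
my_list = NoSetItemList([1, 2, 3, 4, 5])
my_list[0] = "Nope"
print(my_list)

# An existing built-in list cannot simply be re-pointed at the subclass.
plain = [1, 2, 3]
try:
    plain.__class__ = NoSetItemList
    class_swap_rejected = False
except TypeError as exc:
    class_swap_rejected = True
    print("rejected:", exc)
```

The subclass only helps when you control where the list is created; it does nothing for lists that other code has already built.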

Best answer

A bit late to the party, but nonetheless, here is the answer.

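As user2357112 hinted in the comments above, modifying the dict is not enough, because __getitem__ (and the other double-underscore names) are mapped to their slots, and the slots are not updated unless update_slot is called (it is not exported, so that is a little tricky).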
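The slot behaviour described here can be seen without any ctypes. The sketch below uses a hypothetical user-defined class, `Box`, which is not from the original answer. Assigning to a dunder attribute on an ordinary class goes through the type's attribute setter, which refreshes the slot, so the `obj[key] = value` syntax sees the change. The same assignment on `list` is refused outright.

```python
class Box:
    """Hypothetical container used only to illustrate slot updates."""

    def __init__(self):
        self.data = {}

    def __setitem__(self, key, value):
        self.data[key] = value


b = Box()
b[0] = "a"  # stored by the original __setitem__

# Assigning on the class goes through the type's setattr machinery,
# which also updates the underlying slot for a user-defined class.
Box.__setitem__ = lambda self, key, value: None
b[1] = "b"  # now a no-op, even through the subscript syntax
print(b.data)

# The same assignment on a built-in type is rejected.
try:
    list.__setitem__ = lambda self, key, value: None
    builtin_rejected = False
except TypeError as exc:
    builtin_rejected = True
    print("rejected:", exc)
```

Writing straight into the type's dict, as the question does, skips this slot refresh, which matches the observation that `x.__setitem__(0, 10)` was patched while `x[1] = 20` was not.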

Inspired by the comments above, here is a working example that makes __setitem__ a no-op for specific lists:

# assuming v3.8 (tested on Windows x64 and Ubuntu x64)
# definition of PyTypeObject: https://github.com/python/cpython/blob/3.8/Include/cpython/object.h#L177
# no extensive testing was performed and I'll let others decide if this is a good idea or not, but it's possible

import ctypes

Py_TPFLAGS_HEAPTYPE = (1 << 9)

# calculate the offset of the tp_flags field
offset  = ctypes.sizeof(ctypes.c_ssize_t) * 1 # PyObject_VAR_HEAD.ob_base.ob_refcnt
offset += ctypes.sizeof(ctypes.c_void_p)  * 1 # PyObject_VAR_HEAD.ob_base.ob_type
offset += ctypes.sizeof(ctypes.c_ssize_t) * 1 # PyObject_VAR_HEAD.ob_size
offset += ctypes.sizeof(ctypes.c_void_p)  * 1 # tp_name
offset += ctypes.sizeof(ctypes.c_ssize_t) * 2 # tp_basicsize+tp_itemsize
offset += ctypes.sizeof(ctypes.c_void_p)  * 1 # tp_dealloc
offset += ctypes.sizeof(ctypes.c_ssize_t) * 1 # tp_vectorcall_offset
offset += ctypes.sizeof(ctypes.c_void_p)  * 7 # tp_getattr+tp_setattr+tp_as_async+tp_repr+tp_as_number+tp_as_sequence+tp_as_mapping
offset += ctypes.sizeof(ctypes.c_void_p)  * 6 # tp_hash+tp_call+tp_str+tp_getattro+tp_setattro+tp_as_buffer

tp_flags = ctypes.c_ulong.from_address(id(list) + offset)
assert(tp_flags.value == list.__flags__) # should be the same

lst1 = [1,2,3]
lst2 = [1,2,3]
dont_set_me = [lst1] # these lists cannot be set

# define new method
orig = list.__setitem__
def new_setitem(self, *args):
    if [_ for _ in dont_set_me if _ is self]: # check for identical object in list
        print('Nope')
    else:
        return orig(self, *args)

tp_flags.value |= Py_TPFLAGS_HEAPTYPE # add flag, to allow type_setattro to continue
list.__setitem__ = new_setitem # set method, this will already call PyType_Modified and update_slot
tp_flags.value &= (~Py_TPFLAGS_HEAPTYPE) # remove flag

print(lst1, lst2)       # > [1, 2, 3] [1, 2, 3]
lst1[0],lst2[0]='x','x' # > Nope
print(lst1, lst2)       # > [1, 2, 3] ['x', 2, 3]
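The flag toggled in the snippet above can also be inspected read-only from Python through the `__flags__` attribute that every type exposes. As a small sketch (the subclass name `UserDefinedList` is hypothetical), this shows that the built-in `list` does not carry `Py_TPFLAGS_HEAPTYPE` while a class defined in Python does:

```python
Py_TPFLAGS_HEAPTYPE = 1 << 9  # same constant as in the snippet above


class UserDefinedList(list):
    """Hypothetical subclass; classes created in Python are heap types."""


# Test the flag bit on a static built-in type and on a heap type.
list_is_heap = bool(list.__flags__ & Py_TPFLAGS_HEAPTYPE)
subclass_is_heap = bool(UserDefinedList.__flags__ & Py_TPFLAGS_HEAPTYPE)

print("list is a heap type:", list_is_heap)
print("UserDefinedList is a heap type:", subclass_is_heap)
```

Reading `__flags__` this way changes nothing, so it is a safe way to check a type before and after an experiment like the one above.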

Edit
See here for why this is not supported to begin with. It was mainly explained by Guido van Rossum:

This is prohibited intentionally to prevent accidental fatal changes to built-in types (fatal to parts of the code that you never though of). Also, it is done to prevent the changes to affect different interpreters residing in the address space, since built-in types (unlike user-defined classes) are shared between all such interpreters.

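I also searched for all usages of Py_TPFLAGS_HEAPTYPE in CPython, and they all seem to be related to GC or to some validations.

So I guess that if:

  • You don't change the type's structure (I believe the code above does not)
  • You are not using multiple interpreters in the same process
  • You toggle the flag and restore it immediately, while running single-threaded
  • You don't do anything that could affect the GC while the flag is toggled

you will be fine.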

Source: the Stack Overflow question "How to monkey patch the Python list __setitem__ method": https://stackoverflow.com/questions/38257613/
