python - What exactly is a sequence?

The Python docs are somewhat ambiguous:

sequence

An iterable which supports efficient element access using integer indices via the __getitem__() special method and defines a __len__() method that returns the length of the sequence. Some built-in sequence types are list, str, tuple, and bytes. Note that dict also supports __getitem__() and __len__(), but is considered a mapping rather than a sequence because the lookups use arbitrary immutable keys rather than integers.

The collections.abc.Sequence abstract base class defines a much richer interface that goes beyond just __getitem__() and __len__(), adding count(), index(), __contains__(), and __reversed__(). Types that implement this expanded interface can be registered explicitly using register().

In particular, using collections.abc.Sequence as the gold standard, as recommended by some, means that, for example, numpy arrays are not sequences:

import collections.abc
import numpy as np
isinstance(np.arange(6), collections.abc.Sequence)
# False

There is also something called the Sequence Protocol, but it seems to be exposed only in the C-API. There the criterion is

int PySequence_Check(PyObject *o)

Return 1 if the object provides sequence protocol, and 0 otherwise. Note that it returns 1 for Python classes with a __getitem__() method unless they are dict subclasses since in general case it is impossible to determine what the type of keys it supports. This function always succeeds.

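Finally, I don't follow this new(-ish) type annotation business too closely, but I would imagine it too would benefit from a clear concept of what a sequence is.
So my question has both a philosophical and a practical side: what exactly is a sequence, and how does one test whether something is a sequence? Ideally in a way that makes numpy arrays sequences. And if I ever start annotating, how should I approach sequences?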

Best answer

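Brief introduction to typing in Python
Skip this if you know what structural typing, nominal typing and duck typing are.
I think much of the confusion arises from the fact that typing was a provisional module between versions 3.5 and 3.6, and was still subject to change between versions 3.7 and 3.8. This means the way Python tries to handle types through type annotations has been very much in flux.
It also doesn't help that Python is both duck typed and nominally typed. That is, when accessing an attribute of an object, Python is duck typed: the object is checked for the attribute only at runtime, and only at the moment the attribute is requested. However, Python also has nominal typing features (e.g. isinstance() and issubclass()). Nominal typing is where one type is declared to be a subclass of another. This can be done via inheritance, or with the register() method of ABCMeta. typing originally introduced its types using the idea of nominal typing. As of 3.8 it is trying to allow for the more Pythonic structural typing.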
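To make the nominal side concrete, here is a minimal sketch of the two ways a class can be declared a Sequence, and of how duck typing only complains once a missing method is actually used. The class names Inherited and Registered are made up for illustration.

```python
from collections.abc import Sequence


class Inherited(Sequence):
    """Declared a Sequence by inheritance (hypothetical example class)."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


class Registered:
    """Has no sequence methods at all; declared a Sequence only by registration."""


# register() makes Registered a "virtual subclass" without checking its methods.
Sequence.register(Registered)

inherited_ok = isinstance(Inherited("abc"), Sequence)
registered_ok = issubclass(Registered, Sequence)

# Duck typing only fails later, when the missing method is actually needed.
try:
    len(Registered())
    duck_ok = True
except TypeError:
    duck_ok = False

print(inherited_ok, registered_ok, duck_ok)  # True True False
```

Note that registration is purely a declaration: nothing verifies that the registered class really behaves like a sequence.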
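Structural typing is related to duck typing, except that it is taken into consideration at "compile time" rather than at runtime. For example, when a linter is trying to detect possible type errors, such as passing a dict to a function that only accepts sequences like tuples or lists. With structural typing, a class B should be considered a subtype of A if it implements all the methods of A, regardless of whether it has been declared to be a subtype of A (as in nominal typing).
Answer
sequence (small s) is a duck type. A sequence is any ordered collection of objects that provides random access to its members. Specifically, if it defines __len__ and __getitem__ and uses integer indices between 0 and n-1, then it is a sequence. Sequence (big S) is a nominal type. That is, to be a Sequence, a class must be declared as such, either by inheriting from Sequence or by being registered as a subclass.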
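As a rough illustration of the small-s definition, the sketch below defines a class with nothing but __len__ and __getitem__. The class name Squares is hypothetical. Python can iterate over it and test membership through its fallback on __getitem__, yet it is not a Sequence in the nominal sense.

```python
from collections.abc import Sequence


class Squares:
    """Hypothetical duck-typed sequence: only __len__ and __getitem__."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def __getitem__(self, index):
        # Integer indices 0..n-1 only; IndexError ends the fallback iteration.
        if not 0 <= index < self._n:
            raise IndexError(index)
        return index * index


squares = Squares(4)
as_list = list(squares)                      # iteration falls back to __getitem__
has_nine = 9 in squares                      # membership falls back to iteration
is_nominal = isinstance(squares, Sequence)   # never declared, so not a Sequence

print(as_list, has_nine, is_nominal)  # [0, 1, 4, 9] True False
```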
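A numpy array is a sequence, but it is not a Sequence, because it is not registered as a subclass of Sequence. Nor should it be, since it does not implement the full interface promised by Sequence (things like count() and index() are missing).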
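For comparison, this is roughly what the full Sequence interface looks like when a class does opt in. In this sketch the hypothetical class Frozen writes only the two abstract methods, and count(), index(), __contains__ and __reversed__ come along as mixin methods from collections.abc.Sequence.

```python
from collections.abc import Sequence


class Frozen(Sequence):
    """Hypothetical Sequence subclass; only the two abstract methods are written."""

    def __init__(self, items):
        self._items = tuple(items)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


word = Frozen("banana")
count_a = word.count("a")          # mixin method inherited from Sequence
first_n = word.index("n")          # mixin method inherited from Sequence
backwards = list(reversed(word))   # uses the __reversed__ mixin
contains_b = "b" in word           # uses the __contains__ mixin

print(count_a, first_n, contains_b)  # 3 2 True
```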
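It sounds like what you want is a structural type for a sequence (small s). As of 3.8 this is possible using protocols. A protocol defines a set of methods that a class must implement in order to be considered a subclass of the protocol (structural typing).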

from typing import Protocol
import numpy as np

class MySequence(Protocol):
    def __getitem__(self, index):
        raise NotImplementedError
    def __len__(self):
        raise NotImplementedError
    def __contains__(self, item):
        raise NotImplementedError
    def __iter__(self):
        raise NotImplementedError

def f(s: MySequence):
    for i in range(len(s)):
        print(s[i], end=' ')
    print('end')

f([1, 2, 3, 4]) # should be fine
arr: np.ndarray = np.arange(5)
f(arr) # also fine
f({}) # might be considered fine! Depends on your type checker

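Protocols are fairly new, so not all IDEs/type checkers may support them yet. The IDE I use, PyCharm, does. It doesn't like f({}), but it is happy to consider a numpy array a Sequence (big S), which is perhaps not ideal. You can enable runtime checking of protocols with the runtime_checkable decorator from typing. Be warned that all this does is check, one by one, that each of the protocol's methods can be found on the given object/class. As a result, it can become quite expensive if your protocol has a lot of methods.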
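A minimal sketch of the runtime check, assuming Python 3.8 or later. It redefines the MySequence protocol with the decorator applied; OnlyLen is a made-up class that implements only one of the four methods.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class MySequence(Protocol):
    def __getitem__(self, index): ...
    def __len__(self): ...
    def __contains__(self, item): ...
    def __iter__(self): ...


class OnlyLen:
    """Hypothetical class that implements just one of the four methods."""

    def __len__(self):
        return 0


list_ok = isinstance([1, 2, 3], MySequence)
dict_ok = isinstance({}, MySequence)            # dict has all four methods too
only_len_ok = isinstance(OnlyLen(), MySequence)

print(list_ok, dict_ok, only_len_ok)  # True True False
```

Because the check only looks for method names, a dict passes it as well. If mappings must be rejected, the protocol alone will not do that at runtime.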

Regarding "python - What exactly is a sequence?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/62970581/