python - python 的 if 字符串中的子字符串的运行时

以下if语句的大O是什么？

if "pl" in "apple":
   ...

python 如何确定是否在字符串“apple”中找到字符串“pl”的整体大 O 是什么

或字符串搜索中的任何其他子字符串。

这是测试子字符串是否在字符串中的最有效方法吗？它使用与 .find() 相同的算法吗？

最佳答案

时间复杂度平均为 O(N)，最坏情况为 O(NM)(N 是较长字符串的长度，M 是您搜索的较短字符串)。从 Python 3.10 开始，启发式方法用于通过切换算法将最坏情况降低到 O(N + M)。

相同的算法用于str.index()、str.find()、str.__contains__()(in 运算符)和 str.replace()；它是 Boyer-Moore 的简化版想法取自 Boyer–Moore–Horspool和 Sunday算法。

参见 original stringlib discussion post ，以及 fastsearch.h source code ;直到 Python 3.10，自 introduction in Python 2.5 以来基本算法没有改变(除了 some low-level optimisations and corner-case fixes )。

该帖子包含该算法的 Python 代码大纲:

def find(s, p):
    # find first occurrence of p in s
    n = len(s)
    m = len(p)
    skip = delta1(p)[p[m-1]]
    i = 0
    while i <= n-m:
        if s[i+m-1] == p[m-1]: # (boyer-moore)
            # potential match
            if s[i:i+m-1] == p[:m-1]:
                return i
            if s[i+m] not in p:
                i = i + m + 1 # (sunday)
            else:
                i = i + skip # (horspool)
        else:
            # skip
            if s[i+m] not in p:
                i = i + m + 1 # (sunday)
            else:
                i = i + 1
    return -1 # not found

以及速度比较。

在 Python 3.10 中，算法更新为使用 Crochemore and Perrin's Two-Way string searching algorithm 的增强版本对于更大的问题(p 和 s 分别超过 100 和 2100 个字符，s 至少是 的 6 倍p)，响应 pathological edgecase someone reported .添加此更改的提交 included a write-up on how the algorithm works .

Two-way 算法的最坏情况时间复杂度为 O(N + M)，其中 O(M) 是预先支付的成本，用于从 s 构建移位表搜索针。获得该表后，该算法确实具有 O(N/M) 的最佳性能。

关于python - python 的 if 字符串中的子字符串的运行时，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/35220418/

python - python 的 if 字符串中的子字符串的运行时

上一篇：python - Scipy 旋转和缩放图像而不改变其尺寸

下一篇：python - 如何使用 Odoo 中的方法设置默认值？