python - 朴素的字符串搜索算法 - Python

我已经实现了下面的朴素字符串搜索算法 ('find')。它尽可能简单。然而，我在“GeeksforGeeks”上找到了另一种方法，('search') 它看起来具有更好的复杂性。当我对大字符串进行测试时，结果截然不同但相反。

第 1 步:将字符串切成模式长度并进行比较。向前移动一个字符切片并进行比较。这应该是什么复杂性？

def find(pat, txt):
    size = len(pat)
    for i in range( len(txt) -size + 1 ):
        if txt[i : i + size] == pat:
            print 'Pattern found at index %s'%(i)

第二:逐字比较。如果一个字符不匹配中断。否则继续。最后如果所有字符都匹配则打印结果。向前移动一个字符。这应该是什么复杂性？

def search(pat, txt):
    M = len(pat)
    N = len(txt)

    for i in xrange(N-M+1):
        status = 1 
        for j in xrange(M):
            if txt[i+j] != pat[j]:
                status = 0
                break
        if j == M-1 and status != 0:
            print "Pattern found at index " + str(i)

时序测试用例:

testString = ''.join([ 'a' for _ in range(1000*100)] ) + 'b'
testPattern = ''.join([ 'a' for _ in range(100*100) ])  + 'b'

import cProfile
cProfile.run('find(testPattern, testString)')
cProfile.run('search(testPattern, testString)')

用于查找

Pattern found at index 90000
         90007 function calls in 0.160 seconds

用于搜索

Pattern found at index 90000
         5 function calls in 135.951 seconds

在我的算法 find 中，我进行切片和比较。切片的时间复杂度是O(k)，类似的比较应该是另一个O(k)虽然不确定。 Python Time Complexity

在 search 中，我们只运行循环 'k' 次。所以它不应该有更好的时间复杂度。

最佳答案

您的两个算法本质上是相同的(正如@Zah 指出的那样)，唯一的区别是第二个算法中的内部循环是由第一个算法中的底层 C 代码完成的。您观察到的是编译代码和解释代码之间的区别。

如果您想要所有索引并想利用内置方法:

def findAll(s,t):
    """returns all indices where substring t occurs in string s"""
    indices = []
    i = s.find(t)
    while i > -1:
        indices.append(i)
        i = s.find(t,i+1)
    return indices

例如，

>>> findAll("The cat in the hat","at")
[5, 16]

关于python - 朴素的字符串搜索算法 - Python，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/41199057/

python - 朴素的字符串搜索算法 - Python

上一篇：algorithm - [1+2a+3a^2+4a^3+....+ba^(b-1)] MOD M 是否有 O(logn) 解

下一篇：c - 如何检查 C 中某个值在哪个数组中？