python - 使用缓冲区的最长公共(public)前缀？

如果我有一个输入字符串和一个数组:

s = "to_be_or_not_to_be" 
pos = [15, 2, 8]

我试图在引用原始 s 的数组 pos 的连续元素之间找到最长的公共(public)前缀。我正在尝试获得以下输出:

longest = [3,1]

我得到这个的方法是计算以下对的最长公共(public)前缀:

s[15:] 即 _be 和 s[2:] 即 _be_or_not_to_be 给出 3 ( _be )
s[2:] 即 _be_or_not_to_be 和 s[8:] 即 _not_to_be 给出 1 ( _ )

但是，如果 s 很大，我不想在执行类似 s[x:] 的操作时创建多个副本。经过几个小时的搜索，我找到了函数 buffer它只保留输入字符串的一个副本，但我不确定在此上下文中使用它的最有效方法是什么。关于如何实现这一目标的任何建议？

最佳答案

这是一个没有 buffer 的方法，它不会复制，因为它一次只查看一个字符:

from itertools import islice, izip

s = "to_be_or_not_to_be"
pos = [15, 2, 8]


length = len(s)    

for start1, start2 in izip(pos, islice(pos, 1, None)):
    pref = 0
    for pos1, pos2 in izip(xrange(start1, length), xrange(start2, length)):
        if s[pos1] == s[pos2]:
            pref += 1
        else:
            break
    print pref
# prints 3 1

我使用 islice、izip 和 xrange 以防您谈论的字符串可能非常 .

我也无法抗拒这种甚至不需要任何索引的“One Liner”:

[next((i for i, (a, b) in 
    enumerate(izip(islice(s, start1, None), islice(s, start2, None))) 
        if a != b), 
    length - max((start1, start2))) 
 for start1, start2 in izip(pos, islice(pos, 1, None))]

最后一个方法，使用 os.path.commonprefix:

[len(commonprefix((buffer(s, n), buffer(s, m)))) for n, m in zip(pos, pos[1:])]

关于python - 使用缓冲区的最长公共(public)前缀？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/8073808/

python - 使用缓冲区的最长公共(public)前缀？

上一篇：python - 将路径转换为多边形

下一篇：python - 什么是最简单/最轻量级的 WSGI 框架？