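I have a list named data that holds only file names. I want to read each file's contents and check whether a given text (for example, "orange") appears in the file. The file names are appended to the list in order, so if the text "orange" appears in pi.txt (index 2), it also appears in every file after index 2. I want the index or file name where "orange" first appears.
The list holds more than a thousand files, so I want to use binary search.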
data = ['ae.txt', 'ac.txt', 'pi.txt', 'ad.txt', 'mm.txt', 'ab.txt']
target = "orange"

def binary_search(a, x):
    lo = 0
    hi = len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if not x in open(a[mid]).read():
            lo = mid + 1
        elif x in open(a[mid]).read():
            hi = mid
        elif mid > 0 and x in open(a[mid-1]).read():
            hi = mid
        else:
            return mid
    return -1

print(binary_search(data, target))
$ cat ae.txt
papaya
guava
$ cat ac.txt
mango
durian
papaya
guava
$ cat pi.txt
orange
papaya
guava
$ cat ad.txt
orange
papaya
guava
$ cat mm.txt
orange
papaya
guava
$ cat ab.txt
orange
papaya
guava
Best answer
I think there are a few too many if conditions. You can get the expected result like this:
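data = ['ae.txt', 'ac.txt', 'pi.txt', 'ad.txt', 'mm.txt', 'ab.txt']
target = "orange"

def binary_search(a, x):
    """Return the index of the first file in a that contains x, or -1."""
    lo = 0
    hi = len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        # Read each probed file once and close it promptly.
        with open(a[mid]) as f:
            found = x in f.read()
        if found:
            hi = mid      # the first match is at mid or earlier
        else:
            lo = mid + 1  # the first match is after mid
    # lo == len(a) means no file contains x.
    return lo if lo < len(a) else -1

index = binary_search(data, target)
if index == -1:
    print("The word {} was not found in any file".format(target))
else:
    print("The index where we first found the word {} is {}, the file name is {}".format(target, index, data[index]))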
Output:
The index where we first found the word orange is 2, the file name is pi.txt
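As a possible alternative, the same search can be delegated to the standard library's bisect module. The sketch below assumes the property stated in the question (once a file contains the text, every later file does too). The names ContainsView and first_file_containing are made up for this illustration and do not come from the original post.

```python
import bisect
from pathlib import Path


class ContainsView:
    """Hypothetical helper: item i is True when the target occurs in file i."""

    def __init__(self, paths, target):
        self.paths = paths
        self.target = target

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, i):
        # Files are read lazily, so only the files that bisect probes are opened.
        return self.target in Path(self.paths[i]).read_text()


def first_file_containing(paths, target):
    """Return the index of the first file containing target, or -1 if none does."""
    view = ContainsView(paths, target)
    # False sorts before True, so bisect_left lands on the first True entry.
    i = bisect.bisect_left(view, True)
    return i if i < len(paths) else -1
```

With the sample files from the question, first_file_containing(data, target) returns 2. Because the view never reads a file until it is indexed, a list of a thousand files needs only about ten reads.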
Source: the Stack Overflow question "python - Using binary search in Python to search for text": https://stackoverflow.com/questions/56771581/