我正在尝试比较 python2 中 in 和 binarySearch 的复杂性。期望 in 的 O(1) 和 binarySearch 的 O(logn)。然而,结果出人意料。程序是否计时不正确或存在其他错误?
代码如下:
import time
x = [x for x in range(1000000)]
def Time_in(alist,item):
t1 = time.time()
found = item in alist
t2 = time.time()
timer = t2 - t1
return found, timer
def Time_binarySearch(alist, item):
first = 0
last = len(alist)-1
found = False
t1 = time.time()
while first<=last and not found:
midpoint = (first + last)//2
if alist[midpoint] == item:
found = True
else:
if item < alist[midpoint]:
last = midpoint-1
else:
first = midpoint+1
t2 = time.time()
timer = t2 - t1
return found, timer
print "binarySearch: ", Time_binarySearch(x, 600000)
print "in: ", Time_in(x, 600000)
结果是:
最佳答案
二分查找的速度如此之快,以至于当您尝试打印它花费的时间时,它只会打印出 0.0
。而使用 in
花费的时间足够长,您可以看到它花费的时间非常短。
in
确实需要更长时间的原因是因为这是一个列表,而不是 set
或类似的数据结构;而对于集合,成员资格测试介于 O(1) 和 O(logn) 之间,在列表中,每个元素都必须按顺序检查,直到有匹配项,否则列表已用尽。
这是一些基准测试代码:
from __future__ import print_function
import bisect
import timeit
def binarysearch(alist, item):
first = 0
last = len(alist) - 1
found = False
while first <= last and not found:
midpoint = (first + last) // 2
if alist[midpoint] == item:
found = True
else:
if item < alist[midpoint]:
last = midpoint - 1
else:
first = midpoint + 1
return found
def bisect_index(alist, item):
idx = bisect.bisect_left(alist, item)
if idx != len(alist) and alist[idx] == item:
found = True
else:
found = False
return found
time_tests = [
(' 600 in list(range(1000))',
'600 in alist',
'alist = list(range(1000))'),
(' 600 in list(range(10000000))',
'600 in alist',
'alist = list(range(10000000))'),
(' 600 in set(range(1000))',
'600 in aset',
'aset = set(range(1000))'),
('6000000 in set(range(10000000))',
'6000000 in aset',
'aset = set(range(10000000))'),
('binarysearch(list(range(1000)), 600)',
'binarysearch(alist, 600)',
'from __main__ import binarysearch; alist = list(range(1000))'),
('binarysearch(list(range(10000000)), 6000000)',
'binarysearch(alist, 6000000)',
'from __main__ import binarysearch; alist = list(range(10000000))'),
('bisect_index(list(range(1000)), 600)',
'bisect_index(alist, 600)',
'from __main__ import bisect_index; alist = list(range(1000))'),
('bisect_index(list(range(10000000)), 6000000)',
'bisect_index(alist, 6000000)',
'from __main__ import bisect_index; alist = list(range(10000000))'),
]
for display, statement, setup in time_tests:
result = timeit.timeit(statement, setup, number=1000000)
print('{0:<45}{1}'.format(display, result))
结果:
# Python 2.7
600 in list(range(1000)) 5.29039907455
600 in list(range(10000000)) 5.22499394417
600 in set(range(1000)) 0.0402979850769
6000000 in set(range(10000000)) 0.0390179157257
binarysearch(list(range(1000)), 600) 0.961972951889
binarysearch(list(range(10000000)), 6000000) 3.014950037
bisect_index(list(range(1000)), 600) 0.421462059021
bisect_index(list(range(10000000)), 6000000) 0.634694814682
# Python 3.4
600 in list(range(1000)) 8.578510413994081
600 in list(range(10000000)) 8.578105041990057
600 in set(range(1000)) 0.04088461003266275
6000000 in set(range(10000000)) 0.043901249999180436
binarysearch(list(range(1000)), 600) 1.6799193460028619
binarysearch(list(range(10000000)), 6000000) 6.099467994994484
bisect_index(list(range(1000)), 600) 0.5168328559957445
bisect_index(list(range(10000000)), 6000000) 0.7694612839259207
# PyPy 2.6.0 (Python 2.7.9)
600 in list(range(1000)) 0.122292041779
600 in list(range(10000000)) 0.00196599960327
600 in set(range(1000)) 0.101480007172
6000000 in set(range(10000000)) 0.00759720802307
binarysearch(list(range(1000)), 600) 0.242530822754
binarysearch(list(range(10000000)), 6000000) 0.189949035645
bisect_index(list(range(1000)), 600) 0.132127046585
bisect_index(list(range(10000000)), 6000000) 0.197204828262
关于python - binarySearch 与 in,意外结果 (Python),我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/31936657/