python - binarySearch 与 in,意外结果 (Python)

标签 python time-complexity binary-search

我正在尝试比较 python2 中 in 和 binarySearch 的复杂性。期望 in 的 O(1) 和 binarySearch 的 O(logn)。然而,结果出人意料。程序是否计时不正确或存在其他错误?

代码如下:

import time

x = [x for x in range(1000000)]
def Time_in(alist,item):
    t1  = time.time()
    found = item in alist
    t2 = time.time()
    timer = t2 - t1  
    return found, timer

def Time_binarySearch(alist, item):
    first = 0
    last = len(alist)-1
    found = False 
    t1 = time.time()
    while first<=last and not found:
        midpoint = (first + last)//2
        if alist[midpoint] == item:
            found = True
        else:
            if item < alist[midpoint]:
                last = midpoint-1
            else:
                first = midpoint+1
    t2 = time.time()
    timer = t2 - t1
    return found, timer

print "binarySearch: ", Time_binarySearch(x, 600000)
print "in: ", Time_in(x, 600000)

结果是:

enter image description here

最佳答案

二分查找的速度如此之快,以至于当您尝试打印它花费的时间时,它只会打印出 0.0。而使用 in 花费的时间足够长,您可以看到它花费的时间非常短。

in 确实需要更长时间的原因是因为这是一个列表,而不是 set 或类似的数据结构;而对于集合,成员资格测试介于 O(1) 和 O(logn) 之间,在列表中,每个元素都必须按顺序检查,直到有匹配项,否则列表已用尽。

这是一些基准测试代码:

from __future__ import print_function

import bisect
import timeit


def binarysearch(alist, item):
    first = 0
    last = len(alist) - 1
    found = False
    while first <= last and not found:
        midpoint = (first + last) // 2
        if alist[midpoint] == item:
            found = True
        else:
            if item < alist[midpoint]:
                last = midpoint - 1
            else:
                first = midpoint + 1
    return found


def bisect_index(alist, item):
    idx = bisect.bisect_left(alist, item)
    if idx != len(alist) and alist[idx] == item:
        found = True
    else:
        found = False
    return found


time_tests = [
    ('    600 in list(range(1000))',
     '600 in alist',
     'alist = list(range(1000))'),
    ('    600 in list(range(10000000))',
     '600 in alist',
     'alist = list(range(10000000))'),

    ('    600 in set(range(1000))',
     '600 in aset',
     'aset = set(range(1000))'),
    ('6000000 in set(range(10000000))',
     '6000000 in aset',
     'aset = set(range(10000000))'),

    ('binarysearch(list(range(1000)), 600)',
     'binarysearch(alist, 600)',
     'from __main__ import binarysearch; alist = list(range(1000))'),
    ('binarysearch(list(range(10000000)), 6000000)',
     'binarysearch(alist, 6000000)',
     'from __main__ import binarysearch; alist = list(range(10000000))'),

    ('bisect_index(list(range(1000)), 600)',
     'bisect_index(alist, 600)',
     'from __main__ import bisect_index; alist = list(range(1000))'),
    ('bisect_index(list(range(10000000)), 6000000)',
     'bisect_index(alist, 6000000)',
     'from __main__ import bisect_index; alist = list(range(10000000))'),
    ]

for display, statement, setup in time_tests:
    result = timeit.timeit(statement, setup, number=1000000)
    print('{0:<45}{1}'.format(display, result))

结果:

# Python 2.7

    600 in list(range(1000))                 5.29039907455
    600 in list(range(10000000))             5.22499394417
    600 in set(range(1000))                  0.0402979850769
6000000 in set(range(10000000))              0.0390179157257
binarysearch(list(range(1000)), 600)         0.961972951889
binarysearch(list(range(10000000)), 6000000) 3.014950037
bisect_index(list(range(1000)), 600)         0.421462059021
bisect_index(list(range(10000000)), 6000000) 0.634694814682

# Python 3.4

    600 in list(range(1000))                 8.578510413994081
    600 in list(range(10000000))             8.578105041990057
    600 in set(range(1000))                  0.04088461003266275
6000000 in set(range(10000000))              0.043901249999180436
binarysearch(list(range(1000)), 600)         1.6799193460028619
binarysearch(list(range(10000000)), 6000000) 6.099467994994484
bisect_index(list(range(1000)), 600)         0.5168328559957445
bisect_index(list(range(10000000)), 6000000) 0.7694612839259207

# PyPy 2.6.0 (Python 2.7.9)

    600 in list(range(1000))                 0.122292041779
    600 in list(range(10000000))             0.00196599960327
    600 in set(range(1000))                  0.101480007172
6000000 in set(range(10000000))              0.00759720802307
binarysearch(list(range(1000)), 600)         0.242530822754
binarysearch(list(range(10000000)), 6000000) 0.189949035645
bisect_index(list(range(1000)), 600)         0.132127046585
bisect_index(list(range(10000000)), 6000000) 0.197204828262

关于python - binarySearch 与 in,意外结果 (Python),我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/31936657/

相关文章:

algorithm - 求解 a^3 + b^4 = c^3 + d^3 最佳运行时间

c - 二分查找代码行

java - 使用 Java 对文本文件执行二进制搜索

python - 日期时间到文件时间 (Python)

python - 将 2 位字符串数字列表转换为 2 位整数列表

c++ - CUDA的 'thrust::min_element'函数的时间复杂度是多少?

javascript - 这个 NumberComplement 函数的时间复杂度是多少?

Java 对大型抽象列表的二分查找

python - 无法创建能够根据用户输入区分整数和 float 的列表

python - 我如何使用迭代来提高效率?