我正在尝试编写一个代码,如果字典中存在任何列表元素,则该代码应返回 true。这首曲子的表现真的很重要。我知道如果我找到第一个搜索命中,我可以遍历列表并中断。有没有比下面给出的更快或更Pythonic的方法?
for x in someList:
if x in someDict:
return True
return False
编辑:我正在使用 Python 2.7
。我的第一选择是更快的方法。
最佳答案
使用内置 any在两个循环中可以有一些性能优势
any(x in someDict for x in someList)
但您可能需要测量里程。如果您的 list 和 dict 保持相当静态并且您必须多次执行比较,您可以考虑使用 set
someSet = set(someList)
someDict.viewkeys() & someSet
注意 Python 3.X,默认返回 View 而不是序列,所以在使用 Python 3.X 时会很直接
someSet = set(someList)
someDict.keys() & someSet
在上述两种情况下,您都可以用 bool 值包装结果以获得 bool 结果
bool(someDict.keys() & set(someSet ))
异端笔记
我的好奇心战胜了我,我为所有提出的解决方案计时。 看来您的原始解决方案在性能方面更好。这是结果
随机生成的输入样本
def test_data_gen():
from random import sample
for i in range(1,5):
n = 10**i
population = set(range(1,100000))
some_list = sample(list(population),n)
population.difference_update(some_list)
some_dict = dict(zip(sample(population,n),
sample(range(1,100000),n)))
yield "Population Size of {}".format(n), (some_list, some_dict), {}
测试引擎
我重写了答案的测试部分,因为它很困惑,而且答案受到了相当多的关注。我创建了一个 timeit compare python 模块并将其移至 github
测试结果
Timeit repeated for 10 times
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
======================================
Test Run for Population Size of 10
======================================
|Rank |FunctionName |Result |Description
+------+---------------------+----------+-----------------------------------------------
| 1|foo_nested |0.000011 |Original OPs Code
+------+---------------------+----------+-----------------------------------------------
| 2|foo_ifilter_any |0.000014 |any(ifilter(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 3|foo_ifilter_not_not |0.000015 |not not next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 4|foo_imap_any |0.000018 |any(imap(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 5|foo_any |0.000019 |any(x in some_dict for x in some_list)
+------+---------------------+----------+-----------------------------------------------
| 6|foo_ifilter_next |0.000022 |bool(next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 7|foo_set_ashwin |0.000024 |not set(some_dct).isdisjoint(some_lst)
+------+---------------------+----------+-----------------------------------------------
| 8|foo_set |0.000047 |some_dict.viewkeys() & set(some_list )
======================================
Test Run for Population Size of 100
======================================
|Rank |FunctionName |Result |Description
+------+---------------------+----------+-----------------------------------------------
| 1|foo_ifilter_any |0.000071 |any(ifilter(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 2|foo_nested |0.000072 |Original OPs Code
+------+---------------------+----------+-----------------------------------------------
| 3|foo_ifilter_not_not |0.000073 |not not next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 4|foo_ifilter_next |0.000076 |bool(next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 5|foo_imap_any |0.000082 |any(imap(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 6|foo_any |0.000092 |any(x in some_dict for x in some_list)
+------+---------------------+----------+-----------------------------------------------
| 7|foo_set_ashwin |0.000170 |not set(some_dct).isdisjoint(some_lst)
+------+---------------------+----------+-----------------------------------------------
| 8|foo_set |0.000638 |some_dict.viewkeys() & set(some_list )
======================================
Test Run for Population Size of 1000
======================================
|Rank |FunctionName |Result |Description
+------+---------------------+----------+-----------------------------------------------
| 1|foo_ifilter_not_not |0.000746 |not not next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 2|foo_ifilter_any |0.000746 |any(ifilter(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 3|foo_ifilter_next |0.000752 |bool(next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 4|foo_nested |0.000771 |Original OPs Code
+------+---------------------+----------+-----------------------------------------------
| 5|foo_set_ashwin |0.000838 |not set(some_dct).isdisjoint(some_lst)
+------+---------------------+----------+-----------------------------------------------
| 6|foo_imap_any |0.000842 |any(imap(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 7|foo_any |0.000933 |any(x in some_dict for x in some_list)
+------+---------------------+----------+-----------------------------------------------
| 8|foo_set |0.001702 |some_dict.viewkeys() & set(some_list )
======================================
Test Run for Population Size of 10000
======================================
|Rank |FunctionName |Result |Description
+------+---------------------+----------+-----------------------------------------------
| 1|foo_nested |0.007195 |Original OPs Code
+------+---------------------+----------+-----------------------------------------------
| 2|foo_ifilter_next |0.007410 |bool(next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 3|foo_ifilter_any |0.007491 |any(ifilter(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 4|foo_ifilter_not_not |0.007671 |not not next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 5|foo_set_ashwin |0.008385 |not set(some_dct).isdisjoint(some_lst)
+------+---------------------+----------+-----------------------------------------------
| 6|foo_imap_any |0.011327 |any(imap(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 7|foo_any |0.011533 |any(x in some_dict for x in some_list)
+------+---------------------+----------+-----------------------------------------------
| 8|foo_set |0.018313 |some_dict.viewkeys() & set(some_list )
Timeit repeated for 100 times
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
======================================
Test Run for Population Size of 10
======================================
|Rank |FunctionName |Result |Description
+------+---------------------+----------+-----------------------------------------------
| 1|foo_nested |0.000098 |Original OPs Code
+------+---------------------+----------+-----------------------------------------------
| 2|foo_ifilter_any |0.000124 |any(ifilter(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 3|foo_ifilter_not_not |0.000131 |not not next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 4|foo_imap_any |0.000142 |any(imap(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 5|foo_ifilter_next |0.000151 |bool(next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 6|foo_any |0.000158 |any(x in some_dict for x in some_list)
+------+---------------------+----------+-----------------------------------------------
| 7|foo_set_ashwin |0.000186 |not set(some_dct).isdisjoint(some_lst)
+------+---------------------+----------+-----------------------------------------------
| 8|foo_set |0.000496 |some_dict.viewkeys() & set(some_list )
======================================
Test Run for Population Size of 100
======================================
|Rank |FunctionName |Result |Description
+------+---------------------+----------+-----------------------------------------------
| 1|foo_ifilter_any |0.000661 |any(ifilter(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 2|foo_ifilter_not_not |0.000677 |not not next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 3|foo_nested |0.000683 |Original OPs Code
+------+---------------------+----------+-----------------------------------------------
| 4|foo_ifilter_next |0.000684 |bool(next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 5|foo_imap_any |0.000762 |any(imap(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 6|foo_any |0.000854 |any(x in some_dict for x in some_list)
+------+---------------------+----------+-----------------------------------------------
| 7|foo_set_ashwin |0.001291 |not set(some_dct).isdisjoint(some_lst)
+------+---------------------+----------+-----------------------------------------------
| 8|foo_set |0.005018 |some_dict.viewkeys() & set(some_list )
======================================
Test Run for Population Size of 1000
======================================
|Rank |FunctionName |Result |Description
+------+---------------------+----------+-----------------------------------------------
| 1|foo_ifilter_any |0.007585 |any(ifilter(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 2|foo_nested |0.007713 |Original OPs Code
+------+---------------------+----------+-----------------------------------------------
| 3|foo_set_ashwin |0.008256 |not set(some_dct).isdisjoint(some_lst)
+------+---------------------+----------+-----------------------------------------------
| 4|foo_ifilter_not_not |0.008526 |not not next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 5|foo_any |0.009422 |any(x in some_dict for x in some_list)
+------+---------------------+----------+-----------------------------------------------
| 6|foo_ifilter_next |0.010259 |bool(next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 7|foo_imap_any |0.011414 |any(imap(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 8|foo_set |0.019862 |some_dict.viewkeys() & set(some_list )
======================================
Test Run for Population Size of 10000
======================================
|Rank |FunctionName |Result |Description
+------+---------------------+----------+-----------------------------------------------
| 1|foo_imap_any |0.082221 |any(imap(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 2|foo_ifilter_any |0.083573 |any(ifilter(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 3|foo_nested |0.095736 |Original OPs Code
+------+---------------------+----------+-----------------------------------------------
| 4|foo_set_ashwin |0.103427 |not set(some_dct).isdisjoint(some_lst)
+------+---------------------+----------+-----------------------------------------------
| 5|foo_ifilter_next |0.104589 |bool(next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 6|foo_ifilter_not_not |0.117974 |not not next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 7|foo_any |0.127739 |any(x in some_dict for x in some_list)
+------+---------------------+----------+-----------------------------------------------
| 8|foo_set |0.208228 |some_dict.viewkeys() & set(some_list )
Timeit repeated for 1000 times
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
======================================
Test Run for Population Size of 10
======================================
|Rank |FunctionName |Result |Description
+------+---------------------+----------+-----------------------------------------------
| 1|foo_nested |0.000953 |Original OPs Code
+------+---------------------+----------+-----------------------------------------------
| 2|foo_ifilter_any |0.001134 |any(ifilter(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 3|foo_ifilter_not_not |0.001213 |not not next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 4|foo_ifilter_next |0.001340 |bool(next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 5|foo_imap_any |0.001407 |any(imap(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 6|foo_any |0.001535 |any(x in some_dict for x in some_list)
+------+---------------------+----------+-----------------------------------------------
| 7|foo_set_ashwin |0.002252 |not set(some_dct).isdisjoint(some_lst)
+------+---------------------+----------+-----------------------------------------------
| 8|foo_set |0.004701 |some_dict.viewkeys() & set(some_list )
======================================
Test Run for Population Size of 100
======================================
|Rank |FunctionName |Result |Description
+------+---------------------+----------+-----------------------------------------------
| 1|foo_ifilter_any |0.006209 |any(ifilter(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 2|foo_ifilter_next |0.006411 |bool(next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 3|foo_ifilter_not_not |0.006657 |not not next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 4|foo_nested |0.006727 |Original OPs Code
+------+---------------------+----------+-----------------------------------------------
| 5|foo_imap_any |0.007562 |any(imap(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 6|foo_any |0.008262 |any(x in some_dict for x in some_list)
+------+---------------------+----------+-----------------------------------------------
| 7|foo_set_ashwin |0.012260 |not set(some_dct).isdisjoint(some_lst)
+------+---------------------+----------+-----------------------------------------------
| 8|foo_set |0.046773 |some_dict.viewkeys() & set(some_list )
======================================
Test Run for Population Size of 1000
======================================
|Rank |FunctionName |Result |Description
+------+---------------------+----------+-----------------------------------------------
| 1|foo_ifilter_not_not |0.071888 |not not next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 2|foo_ifilter_next |0.072150 |bool(next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 3|foo_nested |0.073382 |Original OPs Code
+------+---------------------+----------+-----------------------------------------------
| 4|foo_ifilter_any |0.075698 |any(ifilter(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 5|foo_set_ashwin |0.077367 |not set(some_dct).isdisjoint(some_lst)
+------+---------------------+----------+-----------------------------------------------
| 6|foo_imap_any |0.090623 |any(imap(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 7|foo_any |0.093301 |any(x in some_dict for x in some_list)
+------+---------------------+----------+-----------------------------------------------
| 8|foo_set |0.177051 |some_dict.viewkeys() & set(some_list )
======================================
Test Run for Population Size of 10000
======================================
|Rank |FunctionName |Result |Description
+------+---------------------+----------+-----------------------------------------------
| 1|foo_nested |0.701317 |Original OPs Code
+------+---------------------+----------+-----------------------------------------------
| 2|foo_ifilter_next |0.706156 |bool(next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 3|foo_ifilter_any |0.723368 |any(ifilter(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 4|foo_ifilter_not_not |0.746650 |not not next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 5|foo_set_ashwin |0.776704 |not set(some_dct).isdisjoint(some_lst)
+------+---------------------+----------+-----------------------------------------------
| 6|foo_imap_any |0.832117 |any(imap(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 7|foo_any |0.881777 |any(x in some_dict for x in some_list)
+------+---------------------+----------+-----------------------------------------------
| 8|foo_set |1.665962 |some_dict.viewkeys() & set(some_list )
Timeit repeated for 10000 times
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
======================================
Test Run for Population Size of 10
======================================
|Rank |FunctionName |Result |Description
+------+---------------------+----------+-----------------------------------------------
| 1|foo_nested |0.010581 |Original OPs Code
+------+---------------------+----------+-----------------------------------------------
| 2|foo_ifilter_any |0.013512 |any(ifilter(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 3|foo_imap_any |0.015321 |any(imap(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 4|foo_ifilter_not_not |0.017680 |not not next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 5|foo_ifilter_next |0.019334 |bool(next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 6|foo_any |0.026274 |any(x in some_dict for x in some_list)
+------+---------------------+----------+-----------------------------------------------
| 7|foo_set_ashwin |0.030881 |not set(some_dct).isdisjoint(some_lst)
+------+---------------------+----------+-----------------------------------------------
| 8|foo_set |0.053605 |some_dict.viewkeys() & set(some_list )
======================================
Test Run for Population Size of 100
======================================
|Rank |FunctionName |Result |Description
+------+---------------------+----------+-----------------------------------------------
| 1|foo_nested |0.070194 |Original OPs Code
+------+---------------------+----------+-----------------------------------------------
| 2|foo_ifilter_not_not |0.078524 |not not next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 3|foo_ifilter_any |0.079499 |any(ifilter(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 4|foo_imap_any |0.087349 |any(imap(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 5|foo_ifilter_next |0.093970 |bool(next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 6|foo_any |0.097948 |any(x in some_dict for x in some_list)
+------+---------------------+----------+-----------------------------------------------
| 7|foo_set_ashwin |0.130725 |not set(some_dct).isdisjoint(some_lst)
+------+---------------------+----------+-----------------------------------------------
| 8|foo_set |0.480841 |some_dict.viewkeys() & set(some_list )
======================================
Test Run for Population Size of 1000
======================================
|Rank |FunctionName |Result |Description
+------+---------------------+----------+-----------------------------------------------
| 1|foo_ifilter_any |0.754491 |any(ifilter(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 2|foo_ifilter_not_not |0.756253 |not not next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 3|foo_ifilter_next |0.771382 |bool(next(ifilter(some_dict.__contains__...
+------+---------------------+----------+-----------------------------------------------
| 4|foo_nested |0.787152 |Original OPs Code
+------+---------------------+----------+-----------------------------------------------
| 5|foo_set_ashwin |0.818520 |not set(some_dct).isdisjoint(some_lst)
+------+---------------------+----------+-----------------------------------------------
| 6|foo_imap_any |0.902947 |any(imap(some_dict.__contains__, some_list))
+------+---------------------+----------+-----------------------------------------------
| 7|foo_any |1.001810 |any(x in some_dict for x in some_list)
+------+---------------------+----------+-----------------------------------------------
| 8|foo_set |2.012781 |some_dict.viewkeys() & set(some_list )
=======================================
Test Run for Population Size of 10000
=======================================
|Rank |FunctionName |Result |Description
+------+---------------------+-----------+-----------------------------------------------
| 1|foo_imap_any |10.071469 |any(imap(some_dict.__contains__, some_list))
+------+---------------------+-----------+-----------------------------------------------
| 2|foo_any |11.127034 |any(x in some_dict for x in some_list)
+------+---------------------+-----------+-----------------------------------------------
| 3|foo_set |18.881414 |some_dict.viewkeys() & set(some_list )
+------+---------------------+-----------+-----------------------------------------------
| 4|foo_nested |8.731133 |Original OPs Code
+------+---------------------+-----------+-----------------------------------------------
| 5|foo_ifilter_not_not |9.019190 |not not next(ifilter(some_dict.__contains__...
+------+---------------------+-----------+-----------------------------------------------
| 6|foo_ifilter_next |9.189966 |bool(next(ifilter(some_dict.__contains__...
+------+---------------------+-----------+-----------------------------------------------
| 7|foo_set_ashwin |9.363886 |not set(some_dct).isdisjoint(some_lst)
+------+---------------------+-----------+-----------------------------------------------
| 8|foo_ifilter_any |9.442759 |any(ifilter(some_dict.__contains__, some_list))
以及来自上述模块的图形比较
结论
过早的优化是邪恶的。很明显,当测试域发生变化时,没有一个解决方案具有最佳性能。根据人口规模和迭代频率,解决方案的性能差异很大。结果再次说明了这样一个事实,即在 Python 中,应该确保代码应该是可读的,而不是确保代码在某些情况下既漂亮又针对性能进行了优化,但它可能无法扩展。
注意有些人怀疑为什么不使用 ifilter 的性能比其他的好
"In Abhit's answer, he timed the different approaches and found that ifilter/next was not the fastest; any idea why this would be the case? "
众所周知,在python中,调用C函数是有开销的,如果种群规模小但迭代频率高,C函数调用开销的积累就会慢慢显现。从图中可以看出,在人口规模较小但迭代次数较高的情况下,使用 ifilter,性能会出现很大差异。
关于python - 优化 Keys 列表的字典成员的性能,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/29839397/