I have two lists:
A = [[2, 5, 13, 14], [4, 5, 10, 12], [2, 9, 10, 11], [2, 5, 12, 13], [4, 5, 6, 12]]
B = [12, 5]
I am trying to find which of the lists in A contain all of the elements of B (order does not matter) and remove the rest.
In this case, the answer is:
[[4, 5, 10, 12], [2, 5, 12, 13], [4, 5, 6, 12]]
If we change B to B = [13], the answer would be:
[[2, 5, 13, 14], [2, 5, 12, 13]]
You can use set.issubset with a list comprehension. Assigning to A[:] modifies the original list object A in place:
A = [[2, 5, 13, 14], [4, 5, 10, 12], [2, 9, 10, 11], [2, 5, 12, 13], [4, 5, 6, 12]]
B = [12, 5]
st = set(B)
A[:] = [sub for sub in A if st.issubset(sub)]
print(A)
[[4, 5, 10, 12], [2, 5, 12, 13], [4, 5, 6, 12]]
The same works for B = [13]:
A = [[2, 5, 13, 14], [4, 5, 10, 12], [2, 9, 10, 11], [2, 5, 12, 13], [4, 5, 6, 12]]
B = [13]
st = set(B)
A[:] = [sub for sub in A if st.issubset(sub)]
print(A)
[[2, 5, 13, 14], [2, 5, 12, 13]]
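To illustrate why the slice assignment matters, here is a small sketch contrasting A[:] = ... with a plain A = ... when a second name refers to the same list. The names alias, A2 and alias2 are made up for this illustration and do not appear in the question:

```python
A = [[2, 5, 13, 14], [4, 5, 10, 12], [2, 9, 10, 11], [2, 5, 12, 13], [4, 5, 6, 12]]
B = [12, 5]
st = set(B)

alias = A  # a second name for the same list object
A[:] = [sub for sub in A if st.issubset(sub)]  # mutates that object in place
print(alias is A, alias)  # alias sees the filtered contents

A2 = [[2, 5, 13, 14], [4, 5, 10, 12]]
alias2 = A2
A2 = [sub for sub in A2 if st.issubset(sub)]  # rebinds A2 to a brand-new list
print(alias2 is A2, len(alias2))  # alias2 still holds the unfiltered list
```

If nothing else holds a reference to A, a plain A = [...] is just as good.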
From the Python documentation on set objects:
s.issubset(t)    s <= t    test whether every element in s is in t
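One detail worth spelling out: the issubset method accepts any iterable, which is why the sublists can stay as lists, whereas the <= operator needs a set on both sides. A minimal sketch, assuming Python 3 (the variable operator_error is only there for illustration):

```python
st = {12, 5}
sub = [4, 5, 10, 12]

# The method form accepts any iterable, so a plain list is fine.
print(st.issubset(sub))

# The operator form requires both operands to be sets.
operator_error = None
try:
    st <= sub
except TypeError as exc:
    operator_error = exc
    print("TypeError:", exc)

# Converting the list first makes the operator form work.
print(st <= set(sub))
```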
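For a very large A, or if you are memory-constrained, you can use a generator expression instead (note that CPython still collects the generated items into a temporary sequence before performing the slice assignment, so the saving may be small):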
A[:] = (sub for sub in A if st.issubset(sub))
If order does not matter and sets are an option, I would suggest using them from the start. Lookups on sets are more efficient.
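As a rough sketch of that suggestion, the sublists could be converted once and then filtered with the superset operator. The names A_sets and kept are assumptions made for this example, and the conversion discards both the order and any duplicate values within each sublist:

```python
A = [[2, 5, 13, 14], [4, 5, 10, 12], [2, 9, 10, 11], [2, 5, 12, 13], [4, 5, 6, 12]]
B = [12, 5]

# One-time conversion; element order and duplicates inside each sublist are lost.
A_sets = [set(sub) for sub in A]
st = set(B)

# s >= st is the operator form of s.issuperset(st).
kept = [s for s in A_sets if s >= st]
print(kept)
```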
Some timings on a slightly larger A:
In [23]: A = [[2, 5, 13, 14], [4, 5, 10, 12], [2, 9, 10, 11], [2, 5, 12, 13], [4, 5, 6, 12],[2, 5, 13, 14], [4, 5, 10, 12], [2, 9, 10, 11], [2, 5, 12, 13], [4, 5, 6, 12],[2, 5, 13, 14], [4, 5, 10, 12], [2, 9, 10, 11], [2, 5, 12, 13], [4, 5, 6, 12]]
In [24]: B = [12, 5]
In [25]: timeit filter(lambda x: all(y in x for y in B), A)
100000 loops, best of 3: 9.45 µs per loop
In [26]: %%timeit
....: st = set(B)
....: [sub for sub in A if st.issubset(sub)]
....:
100000 loops, best of 3: 3.88 µs per loop
In [27]: %%timeit
....: B_set = set(B)
....: map(lambda x: not B_set-set(x), A)
....:
100000 loops, best of 3: 6.95 µs per loop
If you already have the elements of A stored as sets:
In [33]: %%timeit
....: st = set(B)
....: [sub for sub in A if sub >= st]
....:
1000000 loops, best of 3: 1.12 µs per loop
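The timings above appear to come from a Python 2 session. On Python 3, filter and map return lazy iterators, so timing them directly would measure only the creation of the iterator. Below is a hedged sketch of how the comparison might be repeated with the standard timeit module. The function names are made up for this example, set(B) is built once outside the timed code (unlike in the session above), and no particular numbers are implied:

```python
import timeit

base = [[2, 5, 13, 14], [4, 5, 10, 12], [2, 9, 10, 11], [2, 5, 12, 13], [4, 5, 6, 12]]
A = base * 3                      # same 15-element list as in the session above
B = [12, 5]
st = set(B)
A_sets = [set(sub) for sub in A]  # pre-converted variant

def with_all():
    # list(...) forces the lazy filter object to be consumed
    return list(filter(lambda x: all(y in x for y in B), A))

def with_issubset():
    return [sub for sub in A if st.issubset(sub)]

def with_difference():
    # yields booleans rather than the filtered sublists, as in the original
    return list(map(lambda x: not st - set(x), A))

def with_presets():
    return [sub for sub in A_sets if sub >= st]

number = 10_000
for func in (with_all, with_issubset, with_difference, with_presets):
    seconds = timeit.timeit(func, number=number)
    print(f"{func.__name__}: {seconds / number * 1e6:.2f} µs per call")
```

Absolute values will vary with the machine and the Python version, so only the relative ordering is worth comparing.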