l1 = [['a', 'b', 'c'],
      ['a', 'd', 'c'],
      ['a', 'e'],
      ['a', 'd', 'c'],
      ['a', 'f', 'c'],
      ['a', 'e'],
      ['p', 'q', 'r']]
l2 = [1, 1, 1, 2, 0, 0, 0]
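I have two lists, as shown above. l1 is a list of lists, and l2 is another list that holds a score for each of them.
Problem: for all lists in l1 with a score of 0 (taken from l2), find those that are either completely distinct or the shortest among the lists they overlap with.
For example: if I have the lists [1, 2, 3], [2, 3] and [5, 7], all with a score of 0, I would pick [5, 7] because its elements do not appear in any other list, and [2, 3] because it intersects with [1, 2, 3] but is shorter.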
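To make the rule concrete, here is a minimal brute-force sketch of one possible reading of it: a list is kept unless some strictly shorter list shares an element with it. The function name pick is hypothetical and used only for illustration, and the reading itself is an assumption about the intended behaviour.

```python
def pick(lists):
    """Keep a list unless a strictly shorter list shares an element with it.

    This is an assumed reading of the rule, checked by comparing every
    list against every other list (quadratic in the number of lists).
    """
    kept = []
    for i, lst in enumerate(lists):
        dominated = any(
            len(other) < len(lst) and set(lst) & set(other)
            for j, other in enumerate(lists)
            if j != i
        )
        if not dominated:
            kept.append(lst)
    return kept


print(pick([[1, 2, 3], [2, 3], [5, 7]]))  # [[2, 3], [5, 7]]
```

Lists of equal length that overlap are all kept under this reading, since neither one is strictly shorter than the other.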
How I do this now:
import itertools

# Zero-score lists are the candidates; pairs with a positive score are kept as they are.
l = [x for x, y in zip(l1, l2) if y == 0]
lx = [(x, y) for x, y in zip(l1, l2) if y > 0]
c = list(itertools.combinations(l, 2))
un_usable = []
usable = []
# First pass: for every intersecting pair, keep the shorter list.
for i, j in c:
    intersection = len(set(i).intersection(set(j)))
    if intersection > 0:
        if len(i) < len(j):
            usable.append(i)
            un_usable.append(j)
        else:
            usable.append(j)
            un_usable.append(i)
# Second pass: add lists from disjoint pairs that have not been classified yet.
for i, j in c:
    intersection = len(set(i).intersection(set(j)))
    if intersection == 0:
        if i not in un_usable and i not in usable:
            usable.append(i)
        if j not in un_usable and j not in usable:
            usable.append(j)
final = lx + [(x, 0) for x in usable]
which finally gives me:
[(['a', 'b', 'c'], 1),
(['a', 'd', 'c'], 1),
(['a', 'e'], 1),
(['a', 'd', 'c'], 2),
(['a', 'e'], 0),
(['p', 'q', 'r'], 0)]
This is the desired result.
Edit: to handle lists of equal length:
import itertools

l1 = [['a', 'b', 'c'],
      ['a', 'd', 'c'],
      ['a', 'e'],
      ['a', 'd', 'c'],
      ['a', 'f', 'c'],
      ['a', 'e'],
      ['p', 'q', 'r'],
      ['a', 'k']]
l2 = [1, 1, 1, 2, 0, 0, 0, 0]
l = [x for x, y in zip(l1, l2) if y == 0]
lx = [(x, y) for x, y in zip(l1, l2) if y > 0]
c = list(itertools.combinations(l, 2))
un_usable = []
usable = []
# First pass: for every intersecting pair, keep the shorter list, or both on a tie.
for i, j in c:
    intersection = len(set(i).intersection(set(j)))
    if intersection > 0:
        if len(i) < len(j):
            usable.append(i)
            un_usable.append(j)
        elif len(i) == len(j):
            usable.append(i)
            usable.append(j)
        else:
            usable.append(j)
            un_usable.append(i)
# Remove duplicates (lists are unhashable, so go through tuples).
usable = [list(x) for x in set(tuple(x) for x in usable)]
un_usable = [list(x) for x in set(tuple(x) for x in un_usable)]
# Second pass: add lists from disjoint pairs that have not been classified yet.
for i, j in c:
    intersection = len(set(i).intersection(set(j)))
    if intersection == 0:
        if i not in un_usable and i not in usable:
            usable.append(i)
        if j not in un_usable and j not in usable:
            usable.append(j)
final = lx + [(x, 0) for x in usable]
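Is there a better, faster, more Pythonic way to achieve the same thing?
Best answer
Assuming I understood everything correctly, here is an O(N) two-pass algorithm, where N is the total number of elements in the zero-score lists.
Steps:
- Select the lists with a score of zero.
- For each element of each zero-score list, find the length of the shortest zero-score list in which that element occurs. Call this the element's length score.
- For each list, find the minimum of the length scores of all its elements. If the result is smaller than the length of the list, discard the list.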
def select_lsts(lsts, scores):
    # pick out zero score lists
    z_lsts = [lst for lst, score in zip(lsts, scores) if score == 0]
    # keep track of the shortest length of any list in which an element occurs
    len_shortest = dict()
    for lst in z_lsts:
        ln = len(lst)
        for c in lst:
            len_shortest[c] = min(ln, len_shortest.get(c, float('inf')))
    # check if the list is of minimum length for each of its elements
    for lst in z_lsts:
        len_lst = len(lst)
        if any(len_shortest[c] < len_lst for c in lst):
            continue
        yield lst
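The generator above yields only the selected zero-score lists. To get the same shape as final in the question, its output still has to be combined with the positive-score pairs. The following is a minimal sketch of that, assuming a helper named build_final (a hypothetical name, not part of the answer); it repeats the two passes so that the block runs on its own.

```python
def build_final(lsts, scores):
    """Return (list, score) pairs: positive-score pairs first, then the
    zero-score lists that survive the selection.

    build_final is a hypothetical helper used only for illustration.
    """
    zero = [lst for lst, score in zip(lsts, scores) if score == 0]
    # Mirrors the question's `lx`, which keeps pairs with a positive score.
    positive = [(lst, score) for lst, score in zip(lsts, scores) if score > 0]

    # Pass 1: length of the shortest zero-score list containing each element.
    shortest = {}
    for lst in zero:
        for item in lst:
            shortest[item] = min(len(lst), shortest.get(item, len(lst)))

    # Pass 2: keep a list only if none of its elements occurs in a shorter list.
    kept = [lst for lst in zero
            if all(shortest[item] >= len(lst) for item in lst)]
    return positive + [(lst, 0) for lst in kept]


l1 = [['a', 'b', 'c'], ['a', 'd', 'c'], ['a', 'e'], ['a', 'd', 'c'],
      ['a', 'f', 'c'], ['a', 'e'], ['p', 'q', 'r']]
l2 = [1, 1, 1, 2, 0, 0, 0]
final = build_final(l1, l2)
print(final)
```

For the first data set in the question this produces the same six pairs listed there as the desired result. Overlapping lists of equal length, such as ['a', 'e'] and ['a', 'k'] in the edited example, are both kept, because neither is strictly shorter than the other.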
Regarding python - selecting elements from a list of lists based on length and intersection, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50618761/