So I have a list of tuples like this:
[
('Worksheet',),
('1a', 'Calculated'),
('None', 'None', 'None', 'None', 'None'),
('1b', 'General'),
('1b', 'General', 'Basic'),
('1b', 'General', 'Basic', 'Data'),
('1b', 'General', 'Basic', 'Data', 'Line 1'),
('1b', 'General', 'Basic', 'Data', 'Line 2'),
('None', 'None', 'None', 'None', 'None'),
('1c', 'General'),
('1c', 'General', 'Basic'),
('1c', 'General', 'Basic', 'Data'),
('None', 'None', 'None', 'None', 'None'),
('2', 'Active'),
('2', 'Active', 'Passive'),
('None', 'None', 'None', 'None', 'None'),
...
]
Each tuple has a length of 1 to 5. I need to recursively reduce the list to get the following result:
[
('Worksheet',),
('1a', 'Calculated'),
('None', 'None', 'None', 'None', 'None'),
('1b', 'General', 'Basic', 'Data', 'Line 1'),
('1b', 'General', 'Basic', 'Data', 'Line 2'),
('None', 'None', 'None', 'None', 'None'),
('1c', 'General', 'Basic', 'Data'),
('None', 'None', 'None', 'None', 'None'),
('2', 'Active', 'Passive'),
('None', 'None', 'None', 'None', 'None'),
...
]
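Basically, if the next row matches everything in the previous row plus one more element, the previous row should be deleted, so that each hierarchy is reduced to its longest tuple(s).
As my example shows, there are 3 rows with 1c as the first item of the tuple, so they are reduced to the longest one.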
Best answer
def is_subtuple(tup1, tup2):
    '''Return True if all the elements of tup1 are consecutively in tup2.'''
    if len(tup2) < len(tup1):
        return False
    try:
        offset = tup2.index(tup1[0])
    except ValueError:
        return False
    # This could be wrong if tup1[0] is in tup2, but doesn't start the subtuple.
    # You could solve this by recurring on the rest of tup2 if this is false, but
    # it doesn't apply to your input data.
    return tup1 == tup2[offset:offset+len(tup1)]
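The comment in is_subtuple notes that only the first occurrence of tup1[0] is checked. Here is a minimal sketch of one way to lift that restriction by trying every starting offset instead of recursing. The name is_subtuple_any_offset is hypothetical and not part of the original answer.

```python
def is_subtuple_any_offset(tup1, tup2):
    """Return True if tup1 appears as a consecutive run anywhere in tup2."""
    n = len(tup1)
    # Try every start position at which tup1 could still fit inside tup2.
    # If tup2 is shorter than tup1 the range is empty and the result is False.
    return any(tup1 == tup2[start:start + n]
               for start in range(len(tup2) - n + 1))


# The first 'a' does not start the run, but the second one does.
print(is_subtuple_any_offset(('a', 'b'), ('a', 'c', 'a', 'b')))
```

For the input shown in the question this should behave the same as is_subtuple, since shorter rows there always match from the start of the longer ones.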
Then just filter your input list (named l here):
[t for i, t in enumerate(l) if not any(is_subtuple(t, t2) for t2 in l[i+1:])]
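Now, this list comprehension assumes the input list is ordered the way you showed it, with sub-tuples appearing before the tuples that contain them. It's also a bit expensive (O(n**2), I think), but it will get the job done.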
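If the ordering assumption above holds, a single pass over adjacent rows may be enough. The following is a sketch of that alternative, not the method from the answer. The function name drop_prefix_rows and the variable names are made up for illustration, and the input is assumed to be a plain list. A row is dropped only when the very next row is strictly longer and starts with it, so repeated separator rows such as ('None', 'None', 'None', 'None', 'None') are kept. By contrast, is_subtuple returns True for two equal tuples.

```python
def drop_prefix_rows(rows):
    """Keep a row unless the next row extends it by at least one element."""
    kept = []
    # Pair each row with its successor; the last row is paired with None.
    for current, following in zip(rows, rows[1:] + [None]):
        is_extended = (
            following is not None
            and len(following) > len(current)
            and following[:len(current)] == current
        )
        if not is_extended:
            kept.append(current)
    return kept


rows = [
    ('Worksheet',),
    ('1a', 'Calculated'),
    ('None', 'None', 'None', 'None', 'None'),
    ('1b', 'General'),
    ('1b', 'General', 'Basic'),
    ('1b', 'General', 'Basic', 'Data'),
    ('1b', 'General', 'Basic', 'Data', 'Line 1'),
    ('1b', 'General', 'Basic', 'Data', 'Line 2'),
    ('None', 'None', 'None', 'None', 'None'),
    ('1c', 'General'),
    ('1c', 'General', 'Basic'),
    ('1c', 'General', 'Basic', 'Data'),
    ('None', 'None', 'None', 'None', 'None'),
    ('2', 'Active'),
    ('2', 'Active', 'Passive'),
    ('None', 'None', 'None', 'None', 'None'),
]

reduced = drop_prefix_rows(rows)
for row in reduced:
    print(row)
```

Each row is compared only with its successor, so the number of tuple comparisons grows linearly with the length of the list.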
Regarding python - recursively reducing a list of tuples, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/19543636/