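I am reading a text file in which each line contains some numbers and letters.
The first number on each line is an ID, and I want to collect all lines that share the same ID into a separate list.
For example, if the list I get after reading the file looks like this: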
[
['507', 'W', '1000', '1'],
['1', 'M', '6', '2'],
['1', 'W', '1400', '3'],
['1', 'M', '8', '8'],
['1', 'T', '101', '10'],
['507', 'M', '4', '12'],
['1', 'W', '1700', '15'],
['1', 'M', '7', '16'],
['507', 'M', '8', '20'],
...
]
the expected output should look like this:
[
['507', 'W', '1000', '1','507', 'M', '4', '12','507', 'M', '8', '20'],
['1', 'M', '6', '2','1', 'M', '8', '8','1', 'T', '101', '10','1', 'W', '1700', '15','1', 'M', '7', '16']
...
]
And so on for every other unique ID in the file.
All lines starting with "507" should be stored in one list, lines starting with "1" in another list, and so on.
My current code:
import operator
fileName = '/home/salman/Desktop/input.txt'
lineList = []
first_number = []
common_number = []
with open(fileName) as f:
    for line in f:
        lineList = f.readlines()
        lineList.append(line)
lineList = [line.rstrip('\n') for line in open(fileName)]
first_number = [i.split()[0] for i in lineList]
print("Rows in list:" + str(lineList))
print("First number in list : " + str(first_number))
common_number = list(set(first_number))
print("Common Numbers in first number list : "+ str(common_number))
print("Repeated value and their index's are :")
Best answer
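Here is my attempt. First, please read the documentation for groupby: https://docs.python.org/3/library/itertools.html#itertools.groupby and note why the sequence has to be sorted first. Here the key is the first element of each list, so I sort by it. Sorting: https://docs.python.org/3/howto/sorting.html
Flattening a list of lists: How to make a flat list out of list of lists?
Explanation: sort the elements so that entries with the same key, i.e. the same first element, are consecutive. When the key changes, we know that all items with the previous key have been exhausted. So basically we need to find where the first element of consecutive entries changes. That is what the groupby object provides. It yields (key, group) tuples, where key is the first element that identifies each group and group is an iterator over all the lists with that key (so once unpacked it is effectively a list of lists). We unpack the groups and flatten them.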
import itertools
lst = [
['507', 'W', '1000', '1'],
['1', 'M', '6', '2'],
['1', 'W', '1400', '3'],
['1', 'M', '8', '8'],
['1', 'T', '101', '10'],
['507', 'M', '4', '12'],
['1', 'W', '1700', '15'],
['1', 'M', '7', '16'],
['507', 'M', '8', '20']
]
lst = sorted(lst, key=lambda x: x[0])
groups = itertools.groupby(lst, key=lambda x: x[0])
groups = [[*group] for _, group in groups]
# 3rd element
grp_3rd = [[entry[2] for entry in group] for group in groups]
# you could sum it up right here
grp_3rd = [sum(float(entry[2]) for entry in group) for group in groups]
# or you could do to see each key and the corresponding sum i.e. {'1': 3222.0, '507': 1012.0}
grp_3rd = {group[0][0]: sum(float(entry[2]) for entry in group) for group in groups}
# continue on to your output
flatten = lambda list_: [sublist for l in list_ for sublist in l]
groups = [flatten(group) for group in groups]
Output:
[['1', 'M', '6', '2', '1', 'W', '1400', '3', '1', 'M', '8', '8', '1', 'T', '101', '10', '1', 'W', '1700', '15', '1', 'M', '7', '16'],
['507', 'W', '1000', '1', '507', 'M', '4', '12', '507', 'M', '8', '20']]
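To illustrate why the sort matters, here is a minimal sketch comparing groupby on unsorted and sorted input. The shortened rows list and the names get_id, unsorted_keys and sorted_keys are made up for this illustration and are not part of the original answer.

```python
import itertools
from operator import itemgetter

# Shortened, hypothetical sample: the IDs alternate on purpose.
rows = [
    ['507', 'W', '1000', '1'],
    ['1', 'M', '6', '2'],
    ['507', 'M', '4', '12'],
    ['1', 'W', '1400', '3'],
]

get_id = itemgetter(0)  # the key is the first element of each row

# Without sorting, groupby starts a new group every time the key changes,
# so the same ID shows up in several separate groups.
unsorted_keys = [key for key, _ in itertools.groupby(rows, key=get_id)]

# After sorting by the same key, each ID forms exactly one group.
sorted_rows = sorted(rows, key=get_id)
sorted_keys = [key for key, _ in itertools.groupby(sorted_rows, key=get_id)]

print(unsorted_keys)  # ['507', '1', '507', '1']
print(sorted_keys)    # ['1', '507']
```

Note that the IDs are strings here, so the sort is lexicographic. If numeric order is wanted, sorting with key=lambda row: int(row[0]) is one option, assuming every ID is a valid integer.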
Cedric's answer below is easier to understand, so if you find it easier to follow, here is how you could adapt it.
rows = [['507', 'W', '1000', '1'],
['1', 'M', '6', '2'],
['1', 'W', '1400', '3'],
['1', 'M', '8', '8'],
['1', 'T', '101', '10'],
['507', 'M', '4', '12'],
['1', 'W', '1700', '15'],
['1', 'M', '7', '16'],
['507', 'M', '8', '20']]
# get the output and sum directly
merged = {}
for row in rows:
    if row[0] not in merged:
        merged[row[0]] = [[], 0]
    merged[row[0]][0].extend(row[1:])
    merged[row[0]][1] += float(row[2])
# get the output and the list of 3rd elements
merged = {}
for row in rows:
    if row[0] not in merged:
        merged[row[0]] = ([], [])
    merged[row[0]][0].extend(row[1:])
    merged[row[0]][1].append(float(row[2]))
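The dictionary loop above drops the ID from each row (row[1:]). If the goal is the exact layout from the question, with the ID repeated and the groups in order of first appearance, a sketch along these lines might work. It assumes the file is whitespace-separated, as the question's i.split() suggests. The function name group_by_first_field and the io.StringIO stand-in for the file are made up for this example.

```python
import io
from collections import defaultdict

# Hypothetical stand-in for the input file, so the example is self-contained.
sample_file = io.StringIO(
    "507 W 1000 1\n"
    "1 M 6 2\n"
    "1 W 1400 3\n"
    "507 M 4 12\n"
)


def group_by_first_field(lines):
    """Concatenate all rows that share the same first field into one list."""
    grouped = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        # Keep the ID itself in the output, as in the question's expected result.
        grouped[fields[0]].extend(fields)
    # Dicts preserve insertion order (Python 3.7+), so groups come out
    # in the order each ID was first seen.
    return list(grouped.values())


result = group_by_first_field(sample_file)
print(result)
# [['507', 'W', '1000', '1', '507', 'M', '4', '12'], ['1', 'M', '6', '2', '1', 'W', '1400', '3']]
```

With a real file, the same function could be called as group_by_first_field(f) inside a with open(fileName) as f: block, since iterating over a file object yields its lines.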
About "python - How to copy the common elements in sublists?": a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/60437250/