这里我有一个 csv 文件:
b5711586dc018c1deed6b1ea596da304|f4e3945da368711abb3110b621ceada5c21c11f8|bdf7f718f579d64060c7739225de573e4ffda7fe8b10cdaaeb672de5b7c06 98e|2017-01-20 11:42:12|111|Relative|path
1beb1d0ac2d24cb87d8fe6ce05601136|f5ace00777f68909d106719629c85fb3af23b810|62f6ebb14ede7a1b6307cea5f58a18ff59282650af750a575d1bdb530c04f 11f|2017-01-20 11:42:12|111|Relative|path
b5711586dc018c1deed6b1ea596da304|f4e3945da368711abb3110b621ceada5c21c11f8|bdf7f718f579d64060c7739225de573e4ffda7fe8b10cdaaeb672de5b7c06 98e|2017-01-20 11:43:28|111|Relative|path
1beb1d0ac2d24cb87d8fe6ce05601136|f5ace00777f68909d106719629c85fb3af23b810|62f6ebb14ede7a1b6307cea5f58a18ff59282650af750a575d1bdb530c04f 11f|2017-01-20 11:43:28|111|Relative|path
b5711586dc018c1deed6b1ea596da304|f4e3945da368711abb3110b621ceada5c21c11f8|bdf7f718f579d64060c7739225de573e4ffda7fe8b10cdaaeb672de5b7c06 98e|2017-01-20 11:48:03|111|Relative|path
1beb1d0ac2d24cb87d8fe6ce05601136|f5ace00777f68909d106719629c85fb3af23b810|62f6ebb14ede7a1b6307cea5f58a18ff59282650af750a575d1bdb530c04f 11f|2017-01-20 11:48:03|111|Relative|path
但是我想删除多余的行并只保留唯一的行。
有什么办法可以用python写一个脚本来做这个吗? 我使用了以下脚本:
import csv
with open('results/20_01_2017_db_file.csv','rb') as f:
reader = csv.reader(f)
for row in reader:
print ', '.join(row)
最佳答案
with open('results/20_01_2017_db_file.csv','r') as in_file, open('results/20_01_2017_db_unique_file.csv','w') as out_file:
dupl = set()
for line in in_file:
if line in dupl:
dupl.add(line)
out_file.write(line)
关于python - 根据分隔符 | 之前的第一个 id,完全删除 csv 文件中的重复条目行?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/41757403/