I am trying to remove both copies of every duplicated line (the original and its duplicate), for example:
STANGHOLMEN_TA02_GT11
STANGHOLMEN_TA02_GT41
STANGHOLMEN_TA02_GT81
STANGHOLMEN_TA02_GT11
STANGHOLMEN_TA02_GT81
Result:
STANGHOLMEN_TA02_GT41
I tried this script:
lines_seen = set()
with open("example.txt", "w") as output_file:
    for each_line in open("example2.txt", "r"):
        if each_line not in lines_seen:
            output_file.write(each_line)
            lines_seen.add(each_line)
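Unfortunately, it does not work the way I want: it misses some lines and does not remove the duplicated ones. The original file also has stray whitespace between lines here and there.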
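Best answer
You need two passes for this to work, because in a single pass you cannot know whether the current line will be repeated later. Try something like this: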
# Count the occurrences of each line
lines_count = {}
with open('example2.txt', "r") as input_file:
    for each_line in input_file:
        lines_count[each_line] = lines_count.get(each_line, 0) + 1

# Write only the lines that are not repeated
with open('example.txt', "w") as output_file:
    for each_line, count in lines_count.items():
        if count == 1:
            output_file.write(each_line)
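
Since the question mentions stray whitespace between lines, the exact-string comparison above may treat "GT11" and "GT11 " (or a final line without a trailing newline) as different lines. The following is a minimal sketch of the same two-pass idea that strips each line and skips blank ones before counting. It assumes that surrounding whitespace is not significant, which the question does not confirm; the helper names `lines_seen_once` and `remove_duplicated_lines` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def lines_seen_once(lines):
    """Return the stripped, non-blank lines that occur exactly once, in input order."""
    # Assumption: leading/trailing whitespace is noise, so strip it before comparing.
    cleaned = [line.strip() for line in lines]
    cleaned = [line for line in cleaned if line]  # drop blank lines
    counts = Counter(cleaned)  # first pass: count every normalised line
    # Second pass: keep only the lines whose count is exactly 1.
    return [line for line in cleaned if counts[line] == 1]


def remove_duplicated_lines(src_path, dst_path):
    """Write to dst_path only the lines of src_path that are never repeated."""
    with open(src_path, "r") as src:
        kept = lines_seen_once(src)
    with open(dst_path, "w") as dst:
        dst.writelines(line + "\n" for line in kept)


if __name__ == "__main__":
    sample = [
        "STANGHOLMEN_TA02_GT11\n",
        "STANGHOLMEN_TA02_GT41\n",
        "\n",
        "STANGHOLMEN_TA02_GT81 \n",  # trailing space on purpose
        "STANGHOLMEN_TA02_GT11\n",
        "STANGHOLMEN_TA02_GT81",     # last line without a newline
    ]
    print(lines_seen_once(sample))
```

With the file names from the question, the call would be `remove_duplicated_lines("example2.txt", "example.txt")`. If whitespace differences should count as distinct lines, the `strip()` step can be dropped and the answer's original comparison kept.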
This question, "python - Remove both duplicates (original and duplicate) from a text file using Python", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/65542565/