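I have 2 CSVs, New.csv and Old.csv, each with roughly 1K rows and 10 columns, structured as follows:
If a longName (the first column) appears in new.csv but not in old.csv, I want to append the entire new.csv row to changes.csv.
This is what I did at first, but it did not work well: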
def deltaFileMaker():
    with open('Old.csv', 'r', encoding='utf-8') as t1, open('New.csv', 'r', encoding='utf-8') as t2:
        fileone = t1.readlines()
        filetwo = t2.readlines()
    with open('changes.csv', 'w', encoding='utf-8') as outFile:
        for line in filetwo:
            if line not in fileone:
                outFile.write(line)
deltaFileMaker()
I also tried csv-diff, but could not find a way to convert its output into a CSV file.
Update
def deltaFileMaker():
    import csv
    from csv_diff import load_csv, compare
    diff = compare(
        load_csv(open("old.csv", encoding="utf8"), key="longName"),
        load_csv(open("new.csv", encoding="utf8"), key="longName")
    )
    with open('changes.csv', 'w', encoding="utf8") as f:
        w = csv.DictWriter(f, diff.keys())
        w.writeheader()
        w.writerow(diff)
deltaFileMaker()
Best answer
Have you looked at csv-diff? Their website has an example that might be suitable:
from csv_diff import load_csv, compare
diff = compare(
    load_csv(open("one.csv"), key="id"),
    load_csv(open("two.csv"), key="id")
)
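This should return a dict object, which you can parse into a CSV file. Here is an example of parsing that dictionary into rows. Note: writing the changes out correctly is hard, and this is more of a proof of concept - modify it as you see fit.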
from csv_diff import load_csv, compare
from csv import DictWriter
# `diff` is the dict returned by compare() in the snippet above
# Get all the row headers across all the changes
headers = {'change type'}
for key, vals in diff.items():
    for val in vals:  # Multiple of the same difference 'type'
        headers = headers.union(set(val.keys()))
# Write changes to file (newline='' avoids blank lines between rows on Windows)
with open('changes.csv', 'w', encoding='utf-8', newline='') as fh:
    w = DictWriter(fh, headers)
    w.writeheader()
    for key, changes in diff.items():
        for val in changes:  # Add each instance of this type of change
            val.update({'change type': key})  # Add 'change type' data
            w.writerow(val)
For the file one.csv:
id, name, age
1, Cleo, 4
2, Pancakes, 2
and two.csv:
id, name, age
1, Cleo, 5
3, Bailey, 1
4, Elliot, 10
Running this produces:
change type, name, id, changes, age, key
added, Bailey, 3, , 1,
added, Elliot, 4, , 10,
removed, Pancakes, 2, , 2,
changed, , , "{'age': ['4', '5']}", , 1
So it does not suit every kind of change, but it works well for added/removed rows.
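If only the added rows are needed, as in the question, the standard library csv module alone may be enough. The sketch below is not part of the original answer: the function name write_added_rows and its parameters are hypothetical, and it assumes both files have a header row containing the key column (longName in the question). It compares only the key column rather than whole lines, so a row whose other columns changed is not reported as new.

```python
import csv


def write_added_rows(old_path, new_path, out_path, key="longName"):
    """Write rows of new_path whose `key` value is absent from old_path.

    Hypothetical helper for illustration; returns the number of rows written.
    """
    # Collect every key value present in the old file.
    with open(old_path, newline="", encoding="utf-8") as old_file:
        old_keys = {row[key] for row in csv.DictReader(old_file)}

    written = 0
    with open(new_path, newline="", encoding="utf-8") as new_file, \
            open(out_path, "w", newline="", encoding="utf-8") as out_file:
        reader = csv.DictReader(new_file)
        # Reuse the new file's header so the full row is copied unchanged.
        writer = csv.DictWriter(out_file, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if row[key] not in old_keys:
                writer.writerow(row)
                written += 1
    return written
```

Calling write_added_rows('Old.csv', 'New.csv', 'changes.csv') overwrites changes.csv on each run. To append across runs instead, one option is to open the output in 'a' mode and skip writeheader() when the file already exists.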
Regarding python - how to compare 2 different CSV files and output the differences, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/64469479/