
python - How to compare 2 different CSV files and output the differences

Reposted. Author: 行者123. Updated: 2023-12-04 08:43:04

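I have two CSV files, New.csv and Old.csv. Each has about 1K rows and 10 columns, structured like this:
[image: structure of the CSV files]
If new.csv contains a longName (the first column) that is not in old.csv, I want the entire new.csv row appended to changes.csv.
I started with this, but it doesn't work at all: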

def deltaFileMaker():
    with open('Old.csv', 'r', encoding='utf-8') as t1, open('New.csv', 'r', encoding='utf-8') as t2:
        fileone = t1.readlines()
        filetwo = t2.readlines()

    with open('changes.csv', 'w', encoding='utf-8') as outFile:
        for line in filetwo:
            if line not in fileone:
                outFile.write(line)


deltaFileMaker()
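
A note on the attempt above: it compares whole lines, so a row counts as different whenever any column (or even the line ending) differs, not only when its longName is new. That may be why the result looks wrong. Below is a minimal sketch of a key-based comparison using only the standard csv module. It assumes both files have a header row containing a longName column; the function name write_new_rows is made up for illustration.

```python
import csv


def write_new_rows(old_path, new_path, out_path, key="longName"):
    """Copy rows of new_path whose key value never appears in old_path.

    Returns the number of rows written (not counting the header).
    """
    # Collect every key value present in the old file.
    with open(old_path, newline="", encoding="utf-8") as f_old:
        old_keys = {row[key] for row in csv.DictReader(f_old)}

    with open(new_path, newline="", encoding="utf-8") as f_new, \
         open(out_path, "w", newline="", encoding="utf-8") as f_out:
        reader = csv.DictReader(f_new)
        # Reuse the new file's header so the output keeps the same columns.
        writer = csv.DictWriter(f_out, fieldnames=reader.fieldnames)
        writer.writeheader()
        written = 0
        for row in reader:
            if row[key] not in old_keys:
                writer.writerow(row)
                written += 1
    return written
```

Calling write_new_rows('Old.csv', 'New.csv', 'changes.csv') would then write only the rows whose longName is missing from Old.csv. Rows that exist in both files but differ in other columns are deliberately ignored here.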

I also tried csv-diff, but I couldn't find a way to convert its output into a CSV file.
Update
def deltaFileMaker():
    import csv
    from csv_diff import load_csv, compare
    diff = compare(
        load_csv(open("old.csv", encoding="utf8"), key="longName"),
        load_csv(open("new.csv", encoding="utf8"), key="longName")
    )

    with open('changes.csv', 'w', encoding="utf8") as f:
        w = csv.DictWriter(f, diff.keys())
        w.writeheader()
        w.writerow(diff)


deltaFileMaker()


This produces:
[image: screenshot of the resulting output]

Best answer

Have you looked at csv-diff? Their website has an example that might fit:

from csv_diff import load_csv, compare
diff = compare(
    load_csv(open("one.csv"), key="id"),
    load_csv(open("two.csv"), key="id")
)
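This should return a dict object, which you can parse into a CSV file. Here is an example of turning the dict into rows. Note: writing the changes out properly is hard, so treat this as a proof of concept and modify it as needed.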
from csv_diff import load_csv, compare
from csv import DictWriter

# `diff` is the dict returned by compare() in the previous snippet

# Get all the row headers across all the changes
headers = set({'change type'})
for key, vals in diff.items():
    for val in vals:  # Multiple of the same difference 'type'
        headers = headers.union(set(val.keys()))

# Write changes to file (newline='' avoids blank lines between rows on Windows)
with open('changes.csv', 'w', encoding='utf-8', newline='') as fh:
    w = DictWriter(fh, headers)
    w.writeheader()
    for key, changes in diff.items():
        for val in changes:  # Add each instance of this type of change
            val.update({'change type': key})  # Add 'change type' data
            w.writerow(val)
For the file one.csv:
id,     name, age
 1,     Cleo,   4
 2, Pancakes,   2
and two.csv:
id,   name, age
 1,   Cleo,   5
 3, Bailey,   1
 4, Elliot,  10
Running it produces:
change type,     name, id,               changes, age, key
      added,   Bailey,  3,                      ,   1,
      added,   Elliot,  4,                      ,  10,
    removed, Pancakes,  2,                      ,   2,
    changed,         ,   , "{'age': ['4', '5']}",    ,   1
So it's not great for every kind of change, but it works well for added/removed rows.
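
If the "changed" rows matter too, one option is to flatten each of them into one row per modified column instead of dumping the nested dict into a single cell. The sketch below is only an illustration: the shape of a "changed" entry (a key plus a changes mapping of column to [old, new]) is inferred from the output above, the example_diff dict is hand-written rather than produced by csv-diff, and the helper names are made up.

```python
import csv
import io


def changed_rows(diff):
    """Yield one flat dict per modified column: key, column, old, new."""
    for entry in diff.get("changed", []):
        for column, (old, new) in entry["changes"].items():
            yield {"key": entry["key"], "column": column, "old": old, "new": new}


def write_changed(diff, fh):
    """Write the flattened 'changed' entries of diff to the open file fh."""
    writer = csv.DictWriter(fh, fieldnames=["key", "column", "old", "new"])
    writer.writeheader()
    writer.writerows(changed_rows(diff))


# Hand-written stand-in mirroring the example output above (an assumption,
# not real csv-diff output).
example_diff = {
    "added": [
        {"id": "3", "name": "Bailey", "age": "1"},
        {"id": "4", "name": "Elliot", "age": "10"},
    ],
    "removed": [{"id": "2", "name": "Pancakes", "age": "2"}],
    "changed": [{"key": "1", "changes": {"age": ["4", "5"]}}],
}

buffer = io.StringIO()
write_changed(example_diff, buffer)
print(buffer.getvalue())
```

For the example data this gives a header line key,column,old,new followed by the single row 1,age,4,5. Passing a real file opened with newline='' instead of the StringIO buffer would write the same rows to disk.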

Regarding python - How to compare 2 different CSV files and output the differences, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64469479/
