python - Flagging duplicates in a CSV file

Reposted. Author: 行者123. Updated: 2023-11-28 20:54:24

I'm stuck on the problem illustrated by the example below:

"ID","NAME","PHONE","REF","DISCARD"
1,"JOHN",12345,,
2,"PETER",6232,,
3,"JON",12345,,
4,"PETERSON",6232,,
5,"ALEX",7854,,
6,"JON",12345,,

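I want to detect duplicates in the "PHONE" column and flag the subsequent duplicates using the "REF" column, with a value pointing to the "ID" of the first item, and the value "Yes" in the "DISCARD" column: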

"ID","NAME","PHONE","REF","DISCARD"
1,"JOHN",12345,1,
2,"PETER",6232,2,
3,"JON",12345,1,"Yes"
4,"PETERSON",6232,2,"Yes"
5,"ALEX",7854,,
6,"JON",12345,1,"Yes"

So, how do I go about it? I tried this code, but my logic is certainly not right.

import csv
myfile = open("C:\Users\Eduardo\Documents\TEST2.csv", "rb")
myfile1 = open("C:\Users\Eduardo\Documents\TEST2.csv", "rb")

dest = csv.writer(open("C:\Users\Eduardo\Documents\TESTFIXED.csv", "wb"), dialect="excel")

reader = csv.reader(myfile)
verum = list(reader)
verum.sort(key=lambda x: x[2])
for i, row in enumerate(verum):
    if row[2] == verum[i][2]:
        verum[i][3] = row[0]

print verum

Any guidance and help would be much appreciated.

Best answer

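At run time, the only thing you need to keep in memory is a mapping from each phone number to the ID of the first row it appeared in.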

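import csv

# Maps each phone number to the ID of the first row in which it appears.
first_id = {}
# newline='' is what the csv module expects for files in Python 3.
with open(r'c:\temp\input.csv', 'r', newline='') as fin:
    reader = csv.reader(fin)
    with open(r'c:\temp\output.csv', 'w', newline='') as fout:
        writer = csv.writer(fout)
        # omit this if the file has no header row
        writer.writerow(next(reader))
        for row in reader:
            (row_id, name, phone, ref, discard) = row
            if phone in first_id:
                # Later occurrence: point REF at the first row and flag it.
                ref = first_id[phone]
                discard = "YES"
            else:
                first_id[phone] = row_id
            writer.writerow((row_id, name, phone, ref, discard))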
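
Note that the single-pass loop above leaves "REF" empty on the first occurrence of a phone number and writes "YES", whereas the desired output in the question also fills "REF" on the first row of each duplicated group and uses "Yes". If that exact output matters, one possible variant is sketched below. It counts the phone numbers first, so it has to hold all rows in memory, which gives up the streaming property of the answer. The names flag_duplicates, counts and first_id are made up for this illustration, and the sample data is read from an in-memory string rather than a file.

```python
import csv
import io
from collections import Counter


def flag_duplicates(rows):
    """Return new rows with REF and DISCARD filled in.

    rows: list of [ID, NAME, PHONE, REF, DISCARD] lists, header excluded.
    """
    # First pass: count how often each phone number occurs.
    counts = Counter(row[2] for row in rows)
    first_id = {}  # phone -> ID of the first row with that phone
    result = []
    # Second pass: fill in REF and DISCARD.
    for row_id, name, phone, ref, discard in rows:
        if phone in first_id:
            # Later occurrence: reference the first row and mark for discard.
            ref, discard = first_id[phone], "Yes"
        else:
            first_id[phone] = row_id
            if counts[phone] > 1:
                # First row of a duplicated group references itself.
                ref = row_id
        result.append([row_id, name, phone, ref, discard])
    return result


# The sample data from the question.
sample = '''"ID","NAME","PHONE","REF","DISCARD"
1,"JOHN",12345,,
2,"PETER",6232,,
3,"JON",12345,,
4,"PETERSON",6232,,
5,"ALEX",7854,,
6,"JON",12345,,
'''

reader = csv.reader(io.StringIO(sample))
header = next(reader)
flagged = flag_duplicates(list(reader))
for row in flagged:
    print(row)
```

To write the result back to disk, pass header and flagged to a csv.writer opened on the output file, as in the answer. If the input is too large to hold in memory, the single-pass version is the better fit, at the cost of leaving "REF" blank on first occurrences.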

Regarding "python - Flagging duplicates in a CSV file", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/1733166/
