
Python - Display rows with duplicate values in a csv file


I have a .csv file with several columns, one of which is filled with random numbers, and I want to find out whether any value is repeated there. If there are repetitions - an odd case, but that's exactly what I want to check - I want to display/store the complete rows that hold those values.

To make it clear, I have something like this:

First, Whatever, 230, Whichever, etc
Second, Whatever, 11, Whichever, etc
Third, Whatever, 46, Whichever, etc
Fourth, Whatever, 18, Whichever, etc
Fifth, Whatever, 14, Whichever, etc
Sixth, Whatever, 48, Whichever, etc
Seventh, Whatever, 91, Whichever, etc
Eighth, Whatever, 18, Whichever, etc
Ninth, Whatever, 67, Whichever, etc

and I want to get:

Fourth, Whatever, 18, Whichever, etc
Eighth, Whatever, 18, Whichever, etc

To find the duplicated values I store that column in a dictionary and count each key, to discover how many times it appears.

import csv
from collections import Counter, defaultdict, OrderedDict

with open(file, 'rt') as inputfile:
    data = csv.reader(inputfile)

    seen = defaultdict(set)
    counts = Counter(row[col_2] for row in data)

    print "Numbers and times they appear: %s" % counts

and I get:

Counter({' 18 ': 2, ' 46 ': 1, ' 67 ': 1, ' 48 ': 1,...})

Now here comes the problem: I haven't managed to link the keys back to their repetitions and work with them afterwards. If I do

for value in counts:
    if counts[value] > 1:
        print value

That only gives me the keys, which isn't what I want (not to mention that I don't just want to print the value, but the whole row...).

Basically, I'm looking for a way to do something like:

If there's a repeated number:
    print rows containing that number
else:
    print "No repetitions"

Thanks in advance.

Best Answer

Try this; it may work for you.

entries = []            # first occurrence of each value in the third column
duplicate_entries = []  # values seen more than once

# First pass: collect the values from column index 2 and remember duplicates.
with open('in.txt', 'r') as my_file:
    for line in my_file:
        columns = line.strip().split(',')
        if columns[2] not in entries:
            entries.append(columns[2])
        else:
            duplicate_entries.append(columns[2])

# Second pass: print and save every row whose value appears more than once.
if len(duplicate_entries) > 0:
    with open('out.txt', 'w') as out_file:
        with open('in.txt', 'r') as my_file:
            for line in my_file:
                columns = line.strip().split(',')
                if columns[2] in duplicate_entries:
                    print line.strip()
                    out_file.write(line)
else:
    print "No repetitions"

Regarding Python - displaying rows with duplicate values in a csv file, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/24698217/
