python - Sort a file by a column and get the unique elements

Reposted. Author: 太空宇宙. Updated: 2023-11-03 18:04:05

I want to sort the original file by the contents of its second column and get the unique elements in that column:

Original file:

qoow_12_xx7_21 wer1 rwty3
asss_x17_211 aqe3 sda4
acyi_112_werxc xcu12 weqa1
qwer_234_ssd aqe3 wers

Sorted output:

asss_x17_211 aqe3 sda4
qwer_234_ssd aqe3 wers
qoow_12_xx7_21 wer1 rwty3
acyi_112_werxc xcu12 weqa1

Unique values of col2:

aqe3
wer1
xcu12

My attempt, which does not work:

from operator import itemgetter
import itemgetter


def get_unique(data):
    seen=""
    for e in data:
        if e not in seen:
            seen="\t".join(seen)
    return seen

col2=""
with open("myfile.txt", "r") as infile, open("out.xls","w") as outfile:
    for line in infile:
        data=line.rstrip.split("\t")
        sorted_data=sorted(data, key=lambda e: e.itemgetter)
        col2="".join(data[1])
        uniq_col2=get_unique(col2)
    outfile.write(sorted_data)# tab-delimited sorted data
    outfile.write(uniq_col2) # sorted column 2 data

Can someone help me get this code working? Thanks.

Best answer

Try this:

from operator import itemgetter

with open('test.txt') as infile, open('out.txt', 'w') as outfile:
    # sort input by 2nd column
    sorted_lines = sorted(
        (line.strip().split() for line in infile),
        key=itemgetter(1)
    )

    # output sorted input
    for line in sorted_lines:
        outfile.write('\t'.join(line))
        outfile.write('\n')

    # discard duplicates in already sorted sequence => uniq items
    prev_item = None
    for item in (line[1] for line in sorted_lines):
        if item != prev_item:
            prev_item = item
            outfile.write(item)
            outfile.write('\n')
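
As a rough sketch of the same idea, the sort and the de-duplication can be pulled into two small functions that take any iterable of lines, which makes them easy to try without touching files. The names `sort_rows_by_column` and `unique_in_sorted` are made up for this illustration and are not part of the answer above. `itertools.groupby` stands in for the manual `prev_item` loop, and like that loop it only works because the rows are already sorted on the column.

```python
from itertools import groupby
from operator import itemgetter


def sort_rows_by_column(lines, col=1):
    """Split each non-blank line on whitespace and sort the rows by column `col` (0-based)."""
    rows = [line.split() for line in lines if line.strip()]
    return sorted(rows, key=itemgetter(col))


def unique_in_sorted(rows, col=1):
    """Return the distinct values of column `col`; rows must already be sorted on it."""
    # groupby merges only *adjacent* equal keys, hence the sorting requirement.
    return [key for key, _group in groupby(rows, key=itemgetter(col))]


# Sample data copied from the question (assumed to be tab-delimited).
sample = [
    "qoow_12_xx7_21\twer1\trwty3\n",
    "asss_x17_211\taqe3\tsda4\n",
    "acyi_112_werxc\txcu12\tweqa1\n",
    "qwer_234_ssd\taqe3\twers\n",
]

rows = sort_rows_by_column(sample)
for row in rows:
    print("\t".join(row))

for value in unique_in_sorted(rows):
    print(value)
```

With the sample from the question this should print the four rows in the order shown under "Sorted output", followed by aqe3, wer1 and xcu12. Note that the comparison is a plain string comparison, so values are ordered lexicographically rather than numerically.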

Regarding "python - Sort a file by a column and get the unique elements", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/27197047/
