
Python csv reader: how to pipe output to another script using command line

Reposted. Author: 太空宇宙. Updated: 2023-11-04 01:19:35

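I have 2 scripts, a mapper and a reducer. Both take their input from a csv reader. The mapper script should read from a tab-delimited text file, dataset.csv, and the reducer's input should be the mapper's output. I want to save the reducer's output to a text file, output.txt. What is the correct chain of commands to do this?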

Mapper:

#!/usr/bin/python

import sys, csv

reader = csv.reader(sys.stdin, delimiter='\t')
writer = csv.writer(sys.stdout, delimiter='\t', quotechar='"', quoting=csv.QUOTE_ALL)

for line in reader:
    if len(line) > 5:  # parse only lines in the forum_node.tsv file
        if line[5] == 'question':
            _id = line[0]
            student = line[3]  # author_id
        elif line[5] != 'node_type':
            _id = line[7]
            student = line[3]  # author_id
        else:
            continue  # ignore header

        # emit "<thread id><TAB><student id>" for the reducer
        print('{0}\t{1}'.format(_id, student))

Reducer:

#!/usr/bin/python

import sys, csv

reader = csv.reader(sys.stdin, delimiter='\t')
writer = csv.writer(sys.stdout, delimiter='\t', quotechar='"', quoting=csv.QUOTE_ALL)

oldID = None
students = []

for line in reader:
    if len(line) != 2:
        continue  # skip malformed lines

    thisID, thisStudent = line

    # the key changed: flush the students collected for the previous thread
    if oldID and oldID != thisID:
        print('Thread: {0}, students: {1}'.format(oldID, ', '.join(students)))
        students = []

    oldID = thisID
    students.append(thisStudent)

# flush the last thread
if oldID is not None:
    print('Thread: {0}, students: {1}'.format(oldID, ', '.join(students)))

Best answer

Pipe the scripts together:

python mapper.py < dataset.csv | python reducer.py > output.txt

< dataset.csv feeds the CSV file to mapper.py on stdin, and | redirects the mapper's stdout to another command. That other command is python reducer.py, and > output.txt connects the stdout of that script to output.txt.
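If the same chain has to be started from Python rather than from a shell, it can be approximated with the standard subprocess module. The following is only a minimal sketch: the helper name run_pipeline and its parameters are made up for illustration and are not part of the original answer.

```python
import subprocess
import sys


def run_pipeline(first_cmd, second_cmd, input_path, output_path):
    """Rough equivalent of: first_cmd < input_path | second_cmd > output_path

    run_pipeline is a hypothetical helper name; both commands are given as
    argument lists, so no shell is involved.
    """
    with open(input_path, "rb") as src, open(output_path, "wb") as dst:
        # First process: stdin is the input file, stdout goes into a pipe.
        first = subprocess.Popen(first_cmd, stdin=src, stdout=subprocess.PIPE)
        # Second process: stdin is that pipe, stdout is the output file.
        second = subprocess.Popen(second_cmd, stdin=first.stdout, stdout=dst)
        # Close the parent's copy of the pipe so the first process is
        # notified if the second one exits early.
        first.stdout.close()
        second_rc = second.wait()
        first_rc = first.wait()
    return first_rc, second_rc


# For the files named in the question, the call would look like:
# run_pipeline([sys.executable, "mapper.py"],
#              [sys.executable, "reducer.py"],
#              "dataset.csv", "output.txt")
```

The function returns both exit codes, so a caller can check that the mapper and the reducer each finished with 0. For a one-off run, the shell one-liner above remains the simpler choice.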

Regarding "Python csv reader: how to pipe output to another script using command line", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/22123475/
