
python - MRJob MapReduce: store results in a dictionary instead of yielding them?

Reposted. Author: 行者123. Updated: 2023-12-01 05:55:59

I'm new to MRJob and MapReduce, and I have a question about the traditional MRJob word-count example in Python:

from mrjob.job import MRJob

class MRWordCounter(MRJob):
    def mapper(self, key, line):
        for word in line.split():
            yield word, 1

    def reducer(self, word, occurrences):
        yield word, sum(occurrences)

if __name__ == '__main__':
    MRWordCounter.run()

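Is it possible to store the (word, sum(occurrences)) tuples in a dictionary instead of yielding them, so that I can access them later? What is the syntax for doing this? Thanks!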

Best answer

You can simply build a list and return it instead of using yield:

from mrjob.job import MRJob

class MRWordCounter(MRJob):
    def mapper(self, key, line):
        results = []
        for word in line.split():
            # Note that the list should append a (key, value) tuple here.
            results.append((word, 1))
        return results

    def reducer(self, word, occurrences):
        yield word, sum(occurrences)

if __name__ == '__main__':
    MRWordCounter.run()
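
To see why returning a list can stand in for yield, here is a minimal sketch that does not use mrjob at all. It assumes the framework simply iterates over whatever the mapper returns, which is my understanding of how mrjob consumes mapper output. The names run_locally, mapper_with_yield and mapper_with_list are hypothetical helpers made up for this illustration, not part of mrjob. The sketch also shows the reduced pairs being stored in a dictionary, which is what the question asks about.

```python
from collections import defaultdict


def mapper_with_yield(_, line):
    # Generator version, as in the question.
    for word in line.split():
        yield word, 1


def mapper_with_list(_, line):
    # List version, as in the answer: build all pairs, then return them.
    return [(word, 1) for word in line.split()]


def reducer(word, occurrences):
    yield word, sum(occurrences)


def run_locally(mapper, lines):
    """Hypothetical stand-in for the framework: map, group by key, reduce."""
    grouped = defaultdict(list)
    for line in lines:
        # Works for a generator or a list: both are plain iterables.
        for key, value in mapper(None, line):
            grouped[key].append(value)

    counts = {}
    for key in sorted(grouped):
        for word, total in reducer(key, grouped[key]):
            counts[word] = total  # store in a dict instead of yielding
    return counts


lines = ["a b a", "b c"]
counts_from_yield = run_locally(mapper_with_yield, lines)
counts_from_list = run_locally(mapper_with_list, lines)
print(counts_from_list)  # {'a': 2, 'b': 2, 'c': 1}
```

Both mappers give the same dictionary here. In a real mrjob job the mapper and reducer may run in separate processes, so a dict filled inside them is generally not visible to the code that launched the job. Reading the job's output back in a driver script and building the dict there is probably the safer route.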

Regarding "python - MRJob MapReduce: store results in a dictionary instead of yielding them?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/12575831/
