
hadoop - How do I write a combiner and a partitioner in Python for a Hadoop MapReduce job, and how do I invoke them in a Hadoop job?

Reposted. Author: 行者123. Updated: 2023-12-02 21:59:50

How do I write a combiner and a partitioner in Python and then invoke them with Hadoop Streaming?

Best answer

Take a look at Pydoop. I have not explored it myself, but according to the documentation:

Pydoop Script enables you to write simple MapReduce programs for Hadoop with mapper and reducer functions in just a few lines of code. When Pydoop Script isn't enough, you can switch to the more complete Pydoop API, which provides the ability to implement a Python Partitioner, RecordReader, and RecordWriter. Pydoop might not be the best API for all Hadoop use cases, but its unique features make it suitable for specific scenarios and it is being actively improved.
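To make the partitioner idea from the quoted documentation concrete, here is a minimal sketch of the decision a partitioner makes: mapping a key to a reducer index. The function name `partition` and the parameter `num_reducers` are hypothetical and are not taken from Pydoop; in Pydoop this logic would live inside a partitioner class, so check the Pydoop API documentation for the exact class and method signature. The sketch assumes string keys.

```python
import zlib


def partition(key: str, num_reducers: int) -> int:
    """Return the reducer index in [0, num_reducers) for a key.

    Hypothetical helper for illustration; not part of any Hadoop or Pydoop API.
    """
    if num_reducers <= 0:
        raise ValueError("num_reducers must be positive")
    # zlib.crc32 gives the same value in every process. Python's built-in
    # hash() on str is randomized per process by default, so separate map
    # tasks could send the same key to different reducers.
    return zlib.crc32(key.encode("utf-8")) % num_reducers
```

With plain Hadoop Streaming, the `-partitioner` option expects a Java class name rather than a script, for example `org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner`, so a partitioner written in Python needs a framework such as Pydoop.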



Here is an SO question about a Python-based Hadoop combiner.
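For the Hadoop Streaming route, a combiner is just a script that reads map output from standard input and writes partially aggregated records to standard output. The following is a minimal sketch, assuming a word-count style job in which every record is a `key<TAB>integer` line; the names `combine` and `main` are hypothetical, and the record format is an assumption rather than something stated in the question.

```python
import sys
from collections import defaultdict


def combine(lines):
    """Sum integer values per key over 'key<TAB>value' lines.

    Yields 'key<TAB>total' strings, the same format as the input, because
    Hadoop may run a combiner zero, one, or several times.
    """
    totals = defaultdict(int)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        key, _, value = line.partition("\t")
        totals[key] += int(value)  # raises ValueError on a malformed value
    for key in sorted(totals):
        yield f"{key}\t{totals[key]}"


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Entry point for use as a streaming script: read stdin, write stdout."""
    for record in combine(stdin):
        stdout.write(record + "\n")
```

Saved as an executable script (say, a hypothetical `combiner.py` that ends with a call to `main()`), this would be shipped with the job and passed to the Hadoop Streaming jar through its `-combiner` option, next to `-mapper` and `-reducer`. Summation works as a combiner because it is associative and commutative; an operation without those properties could change the job's result.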

Other references

Reference Link

GitHub Link

This link also provides details on the various other Hadoop-Python frameworks available.

Regarding "hadoop - How do I write a combiner and a partitioner in Python for a Hadoop MapReduce job, and how do I invoke them in a Hadoop job?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/28314410/
