
hadoop - Finding the path of a jar file in GCP

Reposted. Author: 行者123. Updated: 2023-12-02 20:23:45

How do I find the path of the hadoop-streaming-1.2.1.jar file on Google Cloud Platform (GCP)?

https://github.com/devangpatel01/TF-IDF-implementation-using-map-reduce-Hadoop-python-

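I am trying to run this MapReduce job with Hadoop on GCP, but I cannot find the path to hadoop-streaming-1.2.1.jar. I tried downloading the jar manually, uploading it to Hadoop, and then running mapper1.py, but the job fails with an error saying the path is wrong. The program above runs fine on my local machine. How should I edit the following command so that it runs on GCP?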

hadoop jar /home/kirthyodackal/hadoop-streaming-1.2.1.jar -input hdfs://cluster-29-m/input_prgs/input_prgs/input1/000000_0 -output hdfs://cluster-29-m/input_prgs/input_prgs/output1 -mapper hdfs://cluster-29-m/input_prgs/input_prgs/mapper1.py -reducer hdfs://cluster-29-m/input_prgs/input_prgs/reducer1.py
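
One way to look for a streaming jar that is already installed on a cluster node is to search a few likely install directories. The following is only a minimal sketch: the function name find_streaming_jars is hypothetical, and the candidate directories are assumptions that vary with the Hadoop distribution and image version, so the list may need adjusting for a given cluster.

```python
import glob
import os

# Candidate install directories. These are assumptions for illustration;
# the real location depends on the Hadoop distribution and image version.
CANDIDATE_DIRS = [
    "/usr/lib/hadoop-mapreduce",
    "/usr/lib/hadoop/share/hadoop/tools/lib",
]

# If HADOOP_HOME is set, also look in the tools directory beneath it
# (an assumed layout, typical of a plain Apache Hadoop install).
if os.environ.get("HADOOP_HOME"):
    CANDIDATE_DIRS.append(
        os.path.join(os.environ["HADOOP_HOME"], "share", "hadoop", "tools", "lib")
    )


def find_streaming_jars(dirs=None):
    """Return the sorted paths of files named hadoop-streaming*.jar in dirs."""
    if dirs is None:
        dirs = CANDIDATE_DIRS
    found = set()
    for directory in dirs:
        pattern = os.path.join(directory, "hadoop-streaming*.jar")
        found.update(glob.glob(pattern))
    return sorted(found)


if __name__ == "__main__":
    jars = find_streaming_jars()
    if jars:
        for path in jars:
            print(path)
    else:
        print("No hadoop-streaming jar found in the candidate directories.")
```

Run on the master node, this prints any matching jar path, which can then be passed to hadoop jar in place of a manually uploaded copy.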

Best answer

I used a different mapper-reducer program and was able to run MapReduce with it. I took the code from https://github.com/SatishUC15/TFIDF-HadoopMapReduce#tfidf-hadoop and ran the following commands on my GCP cluster.

> hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar -file /home/kirthyodackal/MapperPhaseOne.py /home/kirthyodackal/ReducerPhaseOne.py -mapper "python MapperPhaseOne.py" -reducer "python ReducerPhaseOne.py" -input hdfs://cluster-3299-m/mapinput/inputfile -output hdfs://cluster-3299-m/mappred1

> hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar -file /home/kirthyodackal/MapperPhaseTwo.py /home/kirthyodackal/ReducerPhaseTwo.py -mapper "python MapperPhaseTwo.py" -reducer "python ReducerPhaseTwo.py" -input hdfs://cluster-3299-m/mappred1/part-00000 hdfs://cluster-3299-m/mappred1/part-00001 hdfs://cluster-3299-m/mappred1/part-00002 hdfs://cluster-3299-m/mappred1/part-00003 hdfs://cluster-3299-m/mappred1/part-00004 -output hdfs://cluster-3299-m/mappred2

> hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar -file /home/kirthyodackal/MapperPhaseThree.py /home/kirthyodackal/ReducerPhaseThree.py -mapper "python MapperPhaseThree.py" -reducer "python ReducerPhaseThree.py" -input hdfs://cluster-3299-m/mappred2/part-00000 hdfs://cluster-3299-m/mappred2/part-00001 hdfs://cluster-3299-m/mappred2/part-00002 hdfs://cluster-3299-m/mappred2/part-00003 hdfs://cluster-3299-m/mappred2/part-00004 -output hdfs://cluster-3299-m/mappredf
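
The three commands above follow one pattern, with each phase reading the part files written by the previous one. As a sketch of that chaining, the snippet below builds the three argument lists from the paths used above. The helper names streaming_command and build_phase_commands are hypothetical, and the sketch assumes one -file option per script and one -input option per path, which differs slightly from the commands as typed above.

```python
import shlex

# Paths copied from the commands above.
JAR = "/usr/lib/hadoop-mapreduce/hadoop-streaming.jar"
HOME = "/home/kirthyodackal"
FS = "hdfs://cluster-3299-m"


def streaming_command(jar, files, mapper, reducer, inputs, output):
    """Build the argument list for one Hadoop Streaming job."""
    args = ["hadoop", "jar", jar]
    for local_file in files:      # one -file option per script to ship
        args += ["-file", local_file]
    args += ["-mapper", mapper, "-reducer", reducer]
    for path in inputs:           # one -input option per input path
        args += ["-input", path]
    args += ["-output", output]
    return args


def build_phase_commands():
    """Return the argument lists for phases One, Two and Three, in order."""
    # (phase name, input paths, output directory)
    phases = [
        ("One", [f"{FS}/mapinput/inputfile"], f"{FS}/mappred1"),
        ("Two", [f"{FS}/mappred1/part-0000{i}" for i in range(5)], f"{FS}/mappred2"),
        ("Three", [f"{FS}/mappred2/part-0000{i}" for i in range(5)], f"{FS}/mappredf"),
    ]
    commands = []
    for name, inputs, output in phases:
        commands.append(
            streaming_command(
                JAR,
                [f"{HOME}/MapperPhase{name}.py", f"{HOME}/ReducerPhase{name}.py"],
                f"python MapperPhase{name}.py",
                f"python ReducerPhase{name}.py",
                inputs,
                output,
            )
        )
    return commands


if __name__ == "__main__":
    # Only print the commands; on a cluster node each list could be passed
    # to subprocess.run(args, check=True) to actually launch the job.
    for args in build_phase_commands():
        print(shlex.join(args))
```

Building the commands as lists keeps the quoting of the mapper and reducer strings explicit and makes it harder to mistype one of the long part-file paths.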

The following link outlines how I used MapReduce on GCP.
https://github.com/kirthy21/Data-Analysis-Stack-Exchange-Hadoop-Pig-Hive-MapReduce-TFIDF

A similar question about finding the path of a jar file in GCP can be found on Stack Overflow: https://stackoverflow.com/questions/58683808/
