
hadoop - Map and Reduce each reach 100%, but the streaming job fails (Python)

Reposted. Author: 行者123. Updated: 2023-12-02 21:28:45

I am running a graph traversal algorithm implemented with MapReduce. When I test it without Hadoop, it produces the expected output. But when I run this command:

hadoop jar /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.6.0.jar -file /home/hduser/finalmap.py -mapper 'python finalmap.py' -file /home/hduser/finalred.py -reducer 'python finalred.py' -input /Random_Walk_Input -output Random_Walk_Output1

the following happens:
16/01/27 11:03:51 INFO mapreduce.Job: map 0% reduce 0%
16/01/27 11:03:55 INFO mapreduce.Job: map 33% reduce 0%
16/01/27 11:04:02 INFO mapreduce.Job: Task Id : attempt_1453872707553_0001_m_000001_1, Status : FAILED

Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

16/01/27 11:04:03 INFO mapreduce.Job: map 50% reduce 0%

16/01/27 11:04:14 INFO mapreduce.Job: Task Id : attempt_1453872707553_0001_m_000001_2, Status : FAILED

Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

16/01/27 11:04:22 INFO mapreduce.Job: map 50% reduce 17%

16/01/27 11:04:25 INFO mapreduce.Job: map 100% reduce 100%

16/01/27 11:04:26 INFO mapreduce.Job: Job job_1453872707553_0001 failed with state FAILED due to: Task failed task_1453872707553_0001_m_000001

Job failed as tasks failed. failedMaps:1 failedReduces:0

16/01/27 11:04:27 INFO mapreduce.Job: Counters: 39
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=15725173
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=413787
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=3
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=0
    Job Counters
        Failed map tasks=4
        Killed reduce tasks=1
        Launched map tasks=5
        Launched reduce tasks=1
        Other local map tasks=3
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=68482
        Total time spent by all reduces in occupied slots (ms)=19382
        Total time spent by all map tasks (ms)=68482
        Total time spent by all reduce tasks (ms)=19382
        Total vcore-seconds taken by all map tasks=68482
        Total vcore-seconds taken by all reduce tasks=19382
        Total megabyte-seconds taken by all map tasks=70125568
        Total megabyte-seconds taken by all reduce tasks=19847168
    Map-Reduce Framework
        Map input records=17666
        Map output records=767145
        Map output bytes=14081829
        Map output materialized bytes=15616125
        Input split bytes=91
        Combine input records=0
        Spilled Records=767145
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=229
        CPU time spent (ms)=17120
        Physical memory (bytes) snapshot=269684736
        Virtual memory (bytes) snapshot=852369408
        Total committed heap usage (bytes)=200802304
    File Input Format Counters
        Bytes Read=413696
16/01/27 11:04:27 ERROR streaming.StreamJob: Job not successful!
Streaming Command Failed!

What does this mean? The log shows the map and reduce phases each reaching 100%, yet it then reports failedMaps:1 and failedReduces:0.
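One way to read the log above is to group the failed attempts by task. The sketch below is an illustration only and is not part of the original question: the function name `failed_attempts_by_task` and the `sample_log` list are hypothetical, and the sample lines are copied from the output shown earlier. It suggests that every failure belongs to the same map task, `m_000001`, which Hadoop retried several times. The `failedMaps:1` figure appears to count tasks, while the `Failed map tasks=4` counter appears to count attempts; the final `map 100% reduce 100%` line seems to be only the last progress report before the job was marked FAILED, not evidence that all tasks succeeded.

```python
import re
from collections import defaultdict

# Matches lines such as:
# "... Task Id : attempt_1453872707553_0001_m_000001_1, Status : FAILED"
ATTEMPT_RE = re.compile(
    r"Task Id : attempt_(\d+_\d+)_([mr])_(\d+)_(\d+), Status : FAILED"
)


def failed_attempts_by_task(log_lines):
    """Group the attempt numbers of FAILED attempts by their task ID."""
    failures = defaultdict(list)
    for line in log_lines:
        match = ATTEMPT_RE.search(line)
        if match:
            job, kind, task, attempt = match.groups()
            failures[f"task_{job}_{kind}_{task}"].append(int(attempt))
    return dict(failures)


# Sample lines copied from the job output in the question.
sample_log = [
    "16/01/27 11:03:55 INFO mapreduce.Job: map 33% reduce 0%",
    "16/01/27 11:04:02 INFO mapreduce.Job: Task Id : "
    "attempt_1453872707553_0001_m_000001_1, Status : FAILED",
    "16/01/27 11:04:14 INFO mapreduce.Job: Task Id : "
    "attempt_1453872707553_0001_m_000001_2, Status : FAILED",
]

print(failed_attempts_by_task(sample_log))
```

The task ID this produces matches the one named in the `Job ... failed with state FAILED due to: Task failed task_1453872707553_0001_m_000001` line.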

Best answer

Make sure your streaming jar version and your Hadoop version match (that is, they have the same version number).
That fixed the error for me!
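The version comparison suggested in the answer can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a definitive check: the helper names `jar_version`, `hadoop_version`, and `versions_match` are hypothetical, the jar path is the one from the question, and the `Hadoop 2.6.0` and `Hadoop 2.7.1` strings are made-up samples of the first line printed by the `hadoop version` command (the question does not say which Hadoop version was actually installed).

```python
import re


def jar_version(jar_path):
    """Extract the version from a name like hadoop-streaming-2.6.0.jar."""
    match = re.search(r"hadoop-streaming-(\d+(?:\.\d+)*)\.jar$", jar_path)
    return match.group(1) if match else None


def hadoop_version(version_output):
    """Extract the version from the first line of `hadoop version` output."""
    lines = version_output.strip().splitlines()
    first_line = lines[0] if lines else ""
    match = re.match(r"Hadoop (\d+(?:\.\d+)*)", first_line)
    return match.group(1) if match else None


def versions_match(jar_path, version_output):
    """Return True only if both versions were found and are identical."""
    jar = jar_version(jar_path)
    cluster = hadoop_version(version_output)
    return jar is not None and jar == cluster


# Jar path taken from the command in the question.
jar = "/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.6.0.jar"

# Illustrative (assumed) outputs of `hadoop version`.
print(versions_match(jar, "Hadoop 2.6.0\n"))  # same version number
print(versions_match(jar, "Hadoop 2.7.1\n"))  # different version number
```

On a real installation, the second argument could come from running the command itself, for example `subprocess.run(["hadoop", "version"], capture_output=True, text=True).stdout`.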

For more on "hadoop - Map and Reduce each reach 100%, but the streaming job fails (Python)", see the similar question on Stack Overflow: https://stackoverflow.com/questions/35029961/
