
hadoop - How to prevent a Hadoop job from failing due to reduce task failures


I am running an s3distcp job on AWS EMR with Hadoop 2.2.0. The job keeps failing because a reducer task fails after 3 attempts. I have also tried setting both:

    mapred.max.reduce.failures.percent
    mapreduce.reduce.failures.maxpercent

to 50 in both the Oozie Hadoop action configuration and mapred-site.xml, but the job still fails.
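For reference, a minimal sketch of how these two properties might be set in mapred-site.xml. Standard Hadoop semantics: the value is the percentage of reduce tasks allowed to fail before the whole job is marked failed; mapreduce.reduce.failures.maxpercent is the Hadoop 2.x name and mapred.max.reduce.failures.percent the legacy MRv1 name. Whether s3distcp honors a per-job override of this property is not confirmed here.

    <!-- Sketch of a mapred-site.xml fragment: tolerate up to 50% of reduce
         tasks failing before the job as a whole is marked FAILED. -->
    <configuration>
      <!-- Hadoop 2.x (MRv2) property name -->
      <property>
        <name>mapreduce.reduce.failures.maxpercent</name>
        <value>50</value>
      </property>
      <!-- Legacy MRv1 property name, set as well for compatibility -->
      <property>
        <name>mapred.max.reduce.failures.percent</name>
        <value>50</value>
      </property>
    </configuration>

In an Oozie Hadoop action, the same name/value pairs would typically go inside the action's <configuration> element so they reach the job at submission time.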

Here is the log:

2015-10-02 14:42:16,001 INFO [main] org.apache.hadoop.mapreduce.Job: Task Id : attempt_1443541526464_0115_r_000010_2, Status : FAILED
2015-10-02 14:42:17,005 INFO [main] org.apache.hadoop.mapreduce.Job: map 100% reduce 93%
2015-10-02 14:42:29,048 INFO [main] org.apache.hadoop.mapreduce.Job: map 100% reduce 98%
2015-10-02 15:04:20,369 INFO [main] org.apache.hadoop.mapreduce.Job: map 100% reduce 100%
2015-10-02 15:04:21,378 INFO [main] org.apache.hadoop.mapreduce.Job: Job job_1443541526464_0115 failed with state FAILED due to: Task failed task_1443541526464_0115_r_000010
Job failed as tasks failed. failedMaps:0 failedReduces:1

2015-10-02 15:04:21,451 INFO [main] org.apache.hadoop.mapreduce.Job: Counters: 45
  File System Counters
    FILE: Number of bytes read=280
    FILE: Number of bytes written=10512783
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=32185011
    HDFS: Number of bytes written=0
    HDFS: Number of read operations=170
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=28
  Job Counters
    Failed reduce tasks=4
    Launched map tasks=32
    Launched reduce tasks=18
    Data-local map tasks=15
    Rack-local map tasks=17
    Total time spent by all maps in occupied slots (ms)=2652786
    Total time spent by all reduces in occupied slots (ms)=65506584
  Map-Reduce Framework
    Map input records=156810
    Map output records=156810
    Map output bytes=30892192
    Map output materialized bytes=6583455
    Input split bytes=3904
    Combine input records=0
    Combine output records=0
    Reduce input groups=0
    Reduce shuffle bytes=7168
    Reduce input records=0
    Reduce output records=0
    Spilled Records=156810
    Shuffled Maps =448
    Failed Shuffles=0
    Merged Map outputs=448
    GC time elapsed (ms)=2524
    CPU time spent (ms)=108250
    Physical memory (bytes) snapshot=14838984704
    Virtual memory (bytes) snapshot=106769969152
    Total committed heap usage (bytes)=18048614400
  Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
  File Input Format Counters
    Bytes Read=32181107
  File Output Format Counters
    Bytes Written=0
2015-10-02 15:04:21,451 INFO [main] com.amazon.external.elasticmapreduce.s3distcp.S3DistCp: Try to recursively delete hdfs:/tmp/218ad028-8035-4f97-b113-3cfea04502fc/tempspace
2015-10-02 15:04:21,515 INFO [main] org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
2015-10-02 15:04:21,516 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]
2015-10-02 15:04:21,554 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1443541526464_0114_m_000000_0 is done. And is in the process of committing
2015-10-02 15:04:21,570 INFO [main] org.apache.hadoop.mapred.Task: Task attempt_1443541526464_0114_m_000000_0 is allowed to commit now
2015-10-02 15:04:21,584 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of task 'attempt_1443541526464_0114_m_000000_0' to hdfs://rnd2-emr-head.ec2.int$
2015-10-02 15:04:21,598 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1443541526464_0114_m_000000_0' done.
2015-10-02 15:04:21,616 INFO [Thread-6] amazon.emr.metrics.MetricsSaver: Inside MetricsSaver Shutdown Hook

Any suggestions would be greatly appreciated.

Best Answer

Can you try cleaning out the HDFS /tmp directory? Just back the directory up first, since other applications also use /tmp, and restore it if you run into any problems (see the sketch below).
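A minimal sketch of that backup-then-clean sequence using standard hdfs dfs commands; the /tmp_backup path is an illustrative name, not from the original answer:

    # Back up HDFS /tmp to a sibling directory before deleting anything.
    hdfs dfs -mkdir -p /tmp_backup
    hdfs dfs -cp /tmp/* /tmp_backup/

    # Clear out the old temporary data; -skipTrash deletes immediately
    # instead of moving the files to the trash directory.
    hdfs dfs -rm -r -skipTrash /tmp/*

    # If another application breaks afterwards, restore from the backup:
    # hdfs dfs -cp /tmp_backup/* /tmp/

Note that the backup directory is deliberately placed outside /tmp, so the subsequent delete cannot remove it.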

Regarding hadoop - How to prevent a Hadoop job from failing due to reduce task failures, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/32910776/
