hadoop - Hadoop MapReduce example sometimes works and sometimes fails; what is going on?

Reposted by: 可可西里 · Updated: 2023-11-01 15:48:55

I ran a Hadoop MapReduce example with this command:

hadoop jar hadoop-mapreduce-examples-2.7.1.jar wordcount input output

Sometimes it works:

18/11/06 00:37:06 INFO client.RMProxy: Connecting to ResourceManager at node-0/10.10.1.1:8032
18/11/06 00:37:06 INFO input.FileInputFormat: Total input paths to process : 1
18/11/06 00:37:06 INFO mapreduce.JobSubmitter: number of splits:1
18/11/06 00:37:06 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1541484532513_0006
18/11/06 00:37:06 INFO impl.YarnClientImpl: Submitted application application_1541484532513_0006
18/11/06 00:37:06 INFO mapreduce.Job: The url to track the job: http://node-0:8088/proxy/application_1541484532513_0006/
18/11/06 00:37:06 INFO mapreduce.Job: Running job: job_1541484532513_0006
18/11/06 00:37:11 INFO mapreduce.Job: Job job_1541484532513_0006 running in uber mode : false
18/11/06 00:37:11 INFO mapreduce.Job:  map 0% reduce 0%
18/11/06 00:37:15 INFO mapreduce.Job:  map 100% reduce 0%
18/11/06 00:37:18 INFO mapreduce.Job:  map 100% reduce 100%
18/11/06 00:37:18 INFO mapreduce.Job: Job job_1541484532513_0006 completed successfully
18/11/06 00:37:18 INFO mapreduce.Job: Counters: 44
    File System Counters
        FILE: Number of bytes read=216
        FILE: Number of bytes written=231641
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
    Job Counters 
        Launched map tasks=1
        Launched reduce tasks=1
        Rack-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=1300
        Total time spent by all reduces in occupied slots (ms)=1265
        Total time spent by all map tasks (ms)=1300
        Total time spent by all reduce tasks (ms)=1265
        Total vcore-seconds taken by all map tasks=1300
        Total vcore-seconds taken by all reduce tasks=1265
        Total megabyte-seconds taken by all map tasks=1331200
        Total megabyte-seconds taken by all reduce tasks=1295360
    Map-Reduce Framework
        Map input records=1
        Map output records=2
        Map output bytes=20
        Map output materialized bytes=30
        Input split bytes=135
        Combine input records=2
        Combine output records=2
        Reduce input groups=2
        Reduce shuffle bytes=30
        Reduce input records=2
        Reduce output records=2
        Spilled Records=4
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=14
        CPU time spent (ms)=660
        Physical memory (bytes) snapshot=402006016
        Virtual memory (bytes) snapshot=4040646656
        Total committed heap usage (bytes)=402653184
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters 
        Bytes Read=32
    File Output Format Counters 
        Bytes Written=28

Or the log may look like this:

18/11/06 00:35:17 INFO mapreduce.Job: Task Id : attempt_1541484532513_0003_m_000000_1, Status : FAILED
File file:/tmp/hadoop-yarn/staging/suqiang/.staging/job_1541484532513_0003/job.jar does not exist
java.io.FileNotFoundException: File file:/tmp/hadoop-yarn/staging/suqiang/.staging/job_1541484532513_0003/job.jar does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:606)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:819)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:596)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)


18/11/06 00:35:21 INFO mapreduce.Job: Task Id : attempt_1541484532513_0003_m_000000_2, Status : FAILED
File file:/tmp/hadoop-yarn/staging/suqiang/.staging/job_1541484532513_0003/job.jar does not exist
java.io.FileNotFoundException: File file:/tmp/hadoop-yarn/staging/suqiang/.staging/job_1541484532513_0003/job.jar does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:606)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:819)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:596)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)


18/11/06 00:35:25 INFO mapreduce.Job:  map 100% reduce 0%
18/11/06 00:35:29 INFO mapreduce.Job:  map 100% reduce 100%
18/11/06 00:35:29 INFO mapreduce.Job: Job job_1541484532513_0003 completed successfully
18/11/06 00:35:29 INFO mapreduce.Job: Counters: 46
    File System Counters
        FILE: Number of bytes read=216
        FILE: Number of bytes written=231635
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
    Job Counters 
        Failed map tasks=3
        Launched map tasks=4
        Launched reduce tasks=1
        Other local map tasks=3
        Rack-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=6266
        Total time spent by all reduces in occupied slots (ms)=1290
        Total time spent by all map tasks (ms)=6266
        Total time spent by all reduce tasks (ms)=1290
        Total vcore-seconds taken by all map tasks=6266
        Total vcore-seconds taken by all reduce tasks=1290
        Total megabyte-seconds taken by all map tasks=6416384
        Total megabyte-seconds taken by all reduce tasks=1320960
    Map-Reduce Framework
        Map input records=1
        Map output records=2
        Map output bytes=20
        Map output materialized bytes=30
        Input split bytes=135
        Combine input records=2
        Combine output records=2
        Reduce input groups=2
        Reduce shuffle bytes=30
        Reduce input records=2
        Reduce output records=2
        Spilled Records=4
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=14
        CPU time spent (ms)=680
        Physical memory (bytes) snapshot=404619264
        Virtual memory (bytes) snapshot=4036009984
        Total committed heap usage (bytes)=402653184
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters 
        Bytes Read=32
    File Output Format Counters 
        Bytes Written=28

This is strange: the job completed successfully even though the log says job.jar does not exist.

But sometimes the very same operation fails:

18/11/06 00:36:41 INFO mapreduce.Job: Task Id : attempt_1541484532513_0005_r_000000_1, Status : FAILED
File file:/tmp/hadoop-yarn/staging/suqiang/.staging/job_15414845
18/11/06 00:36:46 INFO mapreduce.Job: Task Id : attempt_1541484532513_0005_r_000000_2, Status : FAILED
File file:/tmp/hadoop-yarn/staging/suqiang/.staging/job_1541484532513_0005/job.jar does not exist
java.io.FileNotFoundException: File file:/tmp/hadoop-yarn/staging/suqiang/.staging/job_1541484532513_0005/job.jar does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:606)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:819)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:596)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)


18/11/06 00:36:52 INFO mapreduce.Job:  map 100% reduce 100%
18/11/06 00:36:52 INFO mapreduce.Job: Job job_1541484532513_0005 failed with state FAILED due to: Task failed task_1541484532513_0005_r_000000
Job failed as tasks failed. failedMaps:0 failedReduces:1

18/11/06 00:36:52 INFO mapreduce.Job: Counters: 35
    File System Counters
        FILE: Number of bytes read=186
        FILE: Number of bytes written=115831
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
    Job Counters 
        Failed map tasks=1
        Failed reduce tasks=4
        Launched map tasks=2
        Launched reduce tasks=4
        Other local map tasks=1
        Rack-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=2217
        Total time spent by all reduces in occupied slots (ms)=8012
        Total time spent by all map tasks (ms)=2217
        Total time spent by all reduce tasks (ms)=8012
        Total vcore-seconds taken by all map tasks=2217
        Total vcore-seconds taken by all reduce tasks=8012
        Total megabyte-seconds taken by all map tasks=2270208
        Total megabyte-seconds taken by all reduce tasks=8204288
    Map-Reduce Framework
        Map input records=1
        Map output records=2
        Map output bytes=20
        Map output materialized bytes=30
        Input split bytes=135
        Combine input records=2
        Combine output records=2
        Spilled Records=2
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=7
        CPU time spent (ms)=250
        Physical memory (bytes) snapshot=252555264
        Virtual memory (bytes) snapshot=2014208000
        Total committed heap usage (bytes)=201326592
    File Input Format Counters 
        Bytes Read=32

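What happened in my experiment? Did I make a mistake, or is the problem in the Hadoop installation itself? Has anyone run into the same problem? Any suggestions and solutions would be greatly appreciated.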

Best answer

Since your job fails in uber mode, the problem is that the ApplicationMaster cannot access HDFS, or cannot access those folders in HDFS.
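One detail in the stack traces is worth checking against this diagnosis: the missing path starts with `file:` and the trace goes through `RawLocalFileSystem`, which suggests the staging directory was resolved on a node's local disk rather than in HDFS. A minimal Python sketch, assuming the job output has been saved as text, that pulls the URI scheme out of such lines (the helper name `staging_schemes` is hypothetical):

```python
import re
from urllib.parse import urlparse

# Matches the "File <uri> does not exist" lines printed for failed attempts.
MISSING_FILE = re.compile(r"File (\S+) does not exist")


def staging_schemes(log_text):
    """Return the set of URI schemes of the files reported as missing."""
    schemes = set()
    for match in MISSING_FILE.finditer(log_text):
        schemes.add(urlparse(match.group(1)).scheme)
    return schemes


# One line copied from the failing log in the question.
sample = (
    "File file:/tmp/hadoop-yarn/staging/suqiang/.staging/"
    "job_1541484532513_0003/job.jar does not exist"
)
print(staging_schemes(sample))  # prints {'file'}
```

A result of `file` would mean each NodeManager looks for job.jar on its own local disk, so only containers scheduled on the submitting node can find it. That would be consistent with the intermittent failures, but it is an inference from the log, not something confirmed in the thread.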

While we look for the real solution to your problem, you can disable uber mode for your job like this:

hadoop jar hadoop-mapreduce-examples-2.7.1.jar wordcount -D mapreduce.job.ubertask.enable=false input output

To solve the problem for good, first sort out your ApplicationMaster (AM) configuration.
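As one way to start reviewing the configuration, the sketch below reads `fs.defaultFS` from a `core-site.xml` document and reports whether it points at the local filesystem. `fs.defaultFS` is a standard Hadoop property, but the sample XML, the `hdfs://node-0:9000` address, and the helper name `default_fs` are assumptions for illustration, not taken from the asker's cluster.

```python
import xml.etree.ElementTree as ET


def default_fs(core_site_xml):
    """Return the fs.defaultFS value, or Hadoop's default 'file:///' if unset."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "fs.defaultFS":
            return prop.findtext("value", "").strip()
    return "file:///"


# Hypothetical core-site.xml contents; the address is an assumption.
sample = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://node-0:9000</value>
  </property>
</configuration>"""

fs = default_fs(sample)
print(fs, "(local filesystem)" if fs.startswith("file:") else "(not local)")
```

Running the same check against the file on every node, or running `hdfs getconf -confKey fs.defaultFS` there, should give the same non-local value everywhere if the staging directory is meant to live in HDFS.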

Edit: Maybe your problem is in /etc/hosts. Can you print its contents on both machines? Perhaps you do not have a mapping from 10.10.1.2 to localhost on the 10.10.1.2 machine.
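To compare the two files, it can help to list which names each address maps to. A minimal sketch, assuming the usual `/etc/hosts` layout of an address followed by one or more names; the sample contents and the hostname `node-1` are hypothetical, since the thread never shows the actual files:

```python
def parse_hosts(text):
    """Map each IP address in /etc/hosts-style text to its list of names."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # skip blank and comment-only lines
        address, *names = line.split()
        mapping.setdefault(address, []).extend(names)
    return mapping


# Hypothetical file contents; only node-0/10.10.1.1 appears in the logs.
sample = """
127.0.0.1   localhost
10.10.1.1   node-0
10.10.1.2   node-1   # hypothetical worker name
"""

hosts = parse_hosts(sample)
print(hosts.get("10.10.1.2", []))  # prints ['node-1']
```

On a real machine the input would come from `open("/etc/hosts").read()`; an empty list for an address means that machine has no name mapped to it.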

A similar question on Stack Overflow: https://stackoverflow.com/questions/53167927/
