
hadoop - YARN reports java.net.NoRouteToHostException: No route to host (Host unreachable)


Trying to run h2o on an HDP 3.1 cluster and hitting an error that appears to be related to YARN resource capacity...

[ml1user@HW04 h2o-3.26.0.1-hdp3.1]$ hadoop jar h2odriver.jar -nodes 3 -mapperXmx 10g
Determining driver host interface for mapper->driver callback...
[Possible callback IP address: 192.168.122.1]
[Possible callback IP address: 172.18.4.49]
[Possible callback IP address: 127.0.0.1]
Using mapper->driver callback IP address and port: 172.18.4.49:46015
(You can override these with -driverif and -driverport/-driverportrange and/or specify external IP using -extdriverif.)
Memory Settings:
mapreduce.map.java.opts: -Xms10g -Xmx10g -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dlog4j.defaultInitOverride=true
Extra memory percent: 10
mapreduce.map.memory.mb: 11264
Hive driver not present, not generating token.
19/07/25 14:48:05 INFO client.RMProxy: Connecting to ResourceManager at hw01.ucera.local/172.18.4.46:8050
19/07/25 14:48:06 INFO client.AHSProxy: Connecting to Application History server at hw02.ucera.local/172.18.4.47:10200
19/07/25 14:48:07 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /user/ml1user/.staging/job_1564020515809_0006
19/07/25 14:48:08 INFO mapreduce.JobSubmitter: number of splits:3
19/07/25 14:48:08 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1564020515809_0006
19/07/25 14:48:08 INFO mapreduce.JobSubmitter: Executing with tokens: []
19/07/25 14:48:08 INFO conf.Configuration: found resource resource-types.xml at file:/etc/hadoop/3.1.0.0-78/0/resource-types.xml
19/07/25 14:48:08 INFO impl.YarnClientImpl: Submitted application application_1564020515809_0006
19/07/25 14:48:08 INFO mapreduce.Job: The url to track the job: http://HW01.ucera.local:8088/proxy/application_1564020515809_0006/
Job name 'H2O_47159' submitted
JobTracker job ID is 'job_1564020515809_0006'
For YARN users, logs command is 'yarn logs -applicationId application_1564020515809_0006'
Waiting for H2O cluster to come up...
ERROR: Timed out waiting for H2O cluster to come up (120 seconds)
ERROR: (Try specifying the -timeout option to increase the waiting time limit)
Attempting to clean up hadoop job...
19/07/25 14:50:19 INFO impl.YarnClientImpl: Killed application application_1564020515809_0006
Killed.
19/07/25 14:50:23 INFO client.RMProxy: Connecting to ResourceManager at hw01.ucera.local/172.18.4.46:8050
19/07/25 14:50:23 INFO client.AHSProxy: Connecting to Application History server at hw02.ucera.local/172.18.4.47:10200

----- YARN cluster metrics -----
Number of YARN worker nodes: 3

----- Nodes -----
Node: http://HW03.ucera.local:8042 Rack: /default-rack, RUNNING, 0 containers used, 0.0 / 15.0 GB used, 0 / 3 vcores used
Node: http://HW04.ucera.local:8042 Rack: /default-rack, RUNNING, 0 containers used, 0.0 / 15.0 GB used, 0 / 3 vcores used
Node: http://HW02.ucera.local:8042 Rack: /default-rack, RUNNING, 0 containers used, 0.0 / 15.0 GB used, 0 / 3 vcores used

----- Queues -----
Queue name: default
Queue state: RUNNING
Current capacity: 0.00
Capacity: 1.00
Maximum capacity: 1.00
Application count: 0

Queue 'default' approximate utilization: 0.0 / 45.0 GB used, 0 / 9 vcores used

----------------------------------------------------------------------

ERROR: Unable to start any H2O nodes; please contact your YARN administrator.

A common cause for this is the requested container size (11.0 GB)
exceeds the following YARN settings:

yarn.nodemanager.resource.memory-mb
yarn.scheduler.maximum-allocation-mb

----------------------------------------------------------------------

For YARN users, logs command is 'yarn logs -applicationId application_1564020515809_0006'
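
For context, the two settings named in the error live in yarn-site.xml: yarn.nodemanager.resource.memory-mb caps the total memory a NodeManager offers to YARN, and yarn.scheduler.maximum-allocation-mb caps any single container. If the 11,264 MB request really had exceeded them, raising them would look roughly like the sketch below; the values are illustrative only (the 15 GB per-node figure is taken from the cluster metrics above):

<!-- yarn-site.xml (illustrative values; must be at least the 11264 MB requested above) -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>15360</value> <!-- total MB YARN may allocate on each NodeManager -->
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>12288</value> <!-- largest single container the scheduler will grant -->
</property>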

Looking at the YARN configuration in the Ambari UI, I could not find these properties. But checking the YARN logs in the YARN ResourceManager UI and examining the logs of some of the killed applications, I saw what appear to be host-unreachable errors...
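
Independent of the UI, you can read the configuration the daemons are actually running with. A minimal check, assuming the ResourceManager host/port from the log output above and the stock HDP client-config path:

# Every Hadoop daemon serves its effective configuration at /conf
curl -s http://hw01.ucera.local:8088/conf | grep -E 'maximum-allocation-mb|nodemanager.resource.memory-mb'

# Or inspect yarn-site.xml directly on a node
grep -A2 'allocation-mb\|nodemanager.resource.memory-mb' /etc/hadoop/conf/yarn-site.xml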

Container: container_e05_1564020515809_0006_02_000002 on HW03.ucera.local_45454_1564102219781
LogAggregationType: AGGREGATED
=============================================================================================
LogType:stderr
LogLastModifiedTime:Thu Jul 25 14:50:19 -1000 2019
LogLength:2203
LogContents:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/yarn/local/filecache/11/mapreduce.tar.gz/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/hadoop/yarn/local/usercache/ml1user/appcache/application_1564020515809_0006/filecache/10/job.jar/job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapred.YarnChild).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
java.net.NoRouteToHostException: No route to host (Host unreachable)
at java.net.PlainSocketImpl.socketConnect(Native Method)
....
at java.net.Socket.<init>(Socket.java:211)
at water.hadoop.EmbeddedH2OConfig$BackgroundWriterThread.run(EmbeddedH2OConfig.java:38)

End of LogType:stderr
***********************************************************************

Note the "java.net.NoRouteToHostException: No route to host (Host unreachable)". However, I can reach every node from every other node, and they can all ping each other, so I'm not sure what is going on here. Any suggestions for debugging or fixing this?
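
One caveat when debugging this kind of failure: ping uses ICMP, which a host firewall may allow while still dropping TCP connections, so mutual pings don't prove the relevant ports are open. A quick way to test the actual TCP path, using the callback address and port the driver printed above (the port changes per run):

# From another node in the cluster, test the driver's mapper->driver callback port over TCP
nc -zv 172.18.4.49 46015
# "No route to host" here, despite ping working, points at a host-level firewall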

Best Answer

I think I found the problem. TL;DR: firewalld (the nodes run CentOS 7) was still running, when it should be disabled on HDP clusters.

From another community post:

For Ambari to communicate during setup with the hosts it deploys to and manages, certain ports must be open and available. The easiest way to do this is to temporarily disable iptables, as follows:

systemctl disable firewalld

service firewalld stop

Apparently iptables and firewalld need to be disabled across the entire cluster (supporting documentation can be found here; I had initially disabled them only on the Ambari installation node). After stopping these services across the whole cluster (I recommend using clush, as in the sketch below), I was able to run the yarn job without issue.
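
For completeness, a sketch of that cluster-wide cleanup with clush; the hw0[1-4].ucera.local nodeset is inferred from the hostnames in the logs above, so substitute your own node list or group:

# Stop and disable firewalld on every node in one pass
clush -w 'hw0[1-4].ucera.local' 'systemctl stop firewalld && systemctl disable firewalld'
# Verify ("inactive" expected on each node; || true keeps clush from flagging the nonzero exit)
clush -b -w 'hw0[1-4].ucera.local' 'systemctl is-active firewalld || true'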

Regarding "hadoop - YARN reports java.net.NoRouteToHostException: No route to host (Host unreachable)", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57211993/
