gpt4 book ai didi

hadoop - YARN Timeline Service v2 无法启动

转载 作者:可可西里 更新时间:2023-11-01 14:23:15 24 4
gpt4 key购买 nike

我在 AWS 上设置了一个测试 HDP 集群,用于评估一个项目。 Ambari UI 报告了一些错误,当我根据需要重新启动服务时,我遇到了 YARN 的问题。为 YARN 启动 Timeline Service Reader V2 时,出现错误

2018-08-10 15:51:06,400 INFO  [main] client.RpcRetryingCallerImpl: Call exception, tries=15, retries=15, started=129034 ms ago, cancelled=false, msg=Call to HOSTNAME/IPADDRESS:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: HOSTNAME/IPADDRESS:17020, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=HOSTNAME,17020,1533827052949, seqNum=-1

最终导致

stderr: 
Traceback (most recent call last):
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 982, in restart
self.status(env)
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/timelinereader.py", line 88, in status
check_process_status(pid_file)
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/check_process_status.py", line 43, in check_process_status
raise ComponentIsNotRunning()
ComponentIsNotRunning

The above exception was the cause of the following exception:

Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/timelinereader.py", line 108, in <module>
ApplicationTimelineReader().execute()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 353, in execute
method(env)
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 993, in restart
self.start(env, upgrade_type=upgrade_type)
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/timelinereader.py", line 51, in start
hbase(action='start')
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/hbase_service.py", line 80, in hbase
createTables()
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/hbase_service.py", line 147, in createTables
logoutput=True)
File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
self.env.run()
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 263, in action_run
returns=self.resource.returns)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
result = function(command, **kwargs)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 308, in _call
raise ExecuteTimeoutException(err_msg)
resource_management.core.exceptions.ExecuteTimeoutException: Execution of 'ambari-sudo.sh su yarn-ats -l -s /bin/bash -c 'export PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/var/lib/ambari-agent'"'"' ; sleep 10;export HBASE_CLASSPATH_PREFIX=/usr/hdp/3.0.0.0-1634/hadoop-yarn/timelineservice/*; /usr/hdp/3.0.0.0-1634/hbase/bin/hbase --config /usr/hdp/3.0.0.0-1634/hadoop/conf/embedded-yarn-ats-hbase org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator -Dhbase.client.retries.number=35 -create -s'' was killed due timeout after 300 seconds

哪个组件需要重启才能使 YARN 恢复健康状态?将来调试问题的正确方法是什么?

最佳答案

如果您进入“后台操作”(Ambari UI 中的齿轮图标),然后转到 Timeline Service V2 启动链接(您可能必须首先单击运行 Timeline 服务的计算机才能到达那里) ,你应该在右上角有链接,上面写着“复制”和“打开”。这些将有望向您显示更详细的错误日志。

在我的例子中,Timeline Service V2 无法启动,因为系统内存不足。这是一个小型虚拟机集群,每台机器上只有 2GB 内存。我通过更详细的错误日志发现它给出了内存不足的错误,所以当我将 VM 内存增加到 4GB 时,它能够运行。我最好的猜测是您在运行 Ambari UI 的主 NameNode 上的内存不足。似乎需要大约 4GB 以上的空间,具体取决于您在主 NameNode 上运行的服务数量。

关于hadoop - YARN Timeline Service v2 无法启动,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/51790281/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com