
hadoop - HBase region servers are not communicating with the master

Reposted. Author: 可可西里. Updated: 2023-11-01 14:40:25

I am trying to get an HBase cluster working: two masters and two region servers. My problem is that the region servers cannot tell the master that they are up:

2016-07-01 16:10:21,879 WARN  [regionserver/nbd-hadoop-data1/153.77.130.27:60020] regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2016-07-01 16:10:24,879 INFO [regionserver/nbd-hadoop-data1/153.77.130.27:60020] regionserver.HRegionServer: reportForDuty to master=0.0.0.0,60000,1467381897236 with port=60020, startcode=1467382178755
2016-07-01 16:10:24,879 DEBUG [regionserver/nbd-hadoop-data1/153.77.130.27:60020] ipc.AbstractRpcClient: Use SIMPLE authentication for service RegionServerStatusService, sasl=false
2016-07-01 16:10:24,880 DEBUG [regionserver/nbd-hadoop-data1/153.77.130.27:60020] ipc.AbstractRpcClient: Connecting to /0.0.0.0:60000
2016-07-01 16:10:24,880 WARN [regionserver/nbd-hadoop-data1/153.77.130.27:60020] regionserver.HRegionServer: error telling master we are up
com.google.protobuf.ServiceException: java.net.ConnectException: Connection refused
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:223)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:8982)
at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2270)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:894)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)

The odd thing is that it is opening the port on 0.0.0.0.

The master is waiting for the region servers:

2016-07-01 16:08:43,495 INFO  [0.0.0.0:60000.activeMasterManager] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 220970 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.

But when I stop the region server, the master (via ZooKeeper) notices that the region server has gone offline:

2016-07-01 16:55:25,124 WARN  [main-EventThread] zookeeper.RegionServerTracker: nbd-hadoop-data1,60020,1467384161702 is not online or isn't known to the master.The latter could be caused by a DNS misconfiguration.
2016-07-01 16:55:26,509 INFO [0.0.0.0:60000.activeMasterManager] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 3023984 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.

My HBase cluster configuration is:

153.77.130.29 nbd-hadoop-nn1 - zookeeper, hdfs, hbase master
153.77.130.30 nbd-hadoop-nn2 - zookeeper, hdfs, hbase master
153.77.130.22 nbd-service - zookeeper
153.77.130.27 nbd-hadoop-data1 - hbase regionserver 1
153.77.130.28 nbd-hadoop-data2 - hbase regionserver 2


All machines have /etc/hosts set up as follows:

127.0.0.1       localhost       localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

127.0.0.1 nbd-hadoop-nn1
153.77.130.22 nbd-service
153.77.130.29 nbd-hadoop-nn1
153.77.130.30 nbd-hadoop-nn2
153.77.130.27 nbd-hadoop-data1
153.77.130.28 nbd-hadoop-data2

Master hbase-site.xml:

<property>
<name>hbase.master.port</name>
<value>60000</value>
</property>

<property>
<name>hbase.regionserver.global.memstore.lowerLimit</name>
<value>0.38</value>
</property>

<property>
<name>hbase.regionserver.global.memstore.upperLimit</name>
<value>0.4</value>
</property>

<property>
<name>hbase.regionserver.handler.count</name>
<value>60</value>
</property>

<property>
<name>hbase.regionserver.info.port</name>
<value>60030</value>
</property>

<property>
<name>hbase.regionserver.port</name>
<value>60020</value>
</property>

Region server hbase-site.xml:

 <property>
<name>hbase.master.info.port</name>
<value>60010</value>
</property>

<property>
<name>hbase.master.port</name>
<value>60000</value>
</property>

<property>
<name>hbase.regionserver.global.memstore.lowerLimit</name>
<value>0.38</value>
</property>

<property>
<name>hbase.regionserver.global.memstore.upperLimit</name>
<value>0.4</value>
</property>

<property>
<name>hbase.regionserver.handler.count</name>
<value>60</value>
</property>
<property>
<name>hbase.regionserver.port</name>
<value>60020</value>
</property>

<property>
<name>hbase.regionserver.info.port</name>
<value>60030</value>
</property>

netstat -ntlp on the master server nbd-hadoop-nn1 (correctly shows port 60000 open on :::):

tcp        0      0 :::60000                    :::*                        LISTEN      30839/java

netstat -ntlp on the region server nbd-hadoop-data1 shows port 60020 bound to localhost. I think this is the root of the problem:

tcp        0      0 ::ffff:127.0.0.1:60020      :::*                        LISTEN      22858/java
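To make the difference between the two netstat lines explicit, here is a minimal Python sketch that classifies a "Local Address" value as loopback-only or not. The function name is_loopback_bind is made up for illustration. It assumes the address is in the host:port form shown above, and it unwraps IPv4-mapped IPv6 addresses such as ::ffff:127.0.0.1 before checking.

```python
import ipaddress


def is_loopback_bind(local_address: str) -> bool:
    """Return True if a netstat 'Local Address' (host:port) is loopback-only.

    Hypothetical helper for illustration; not part of HBase or netstat.
    """
    # The port is everything after the last colon, the host is the rest.
    host, _, _port = local_address.rpartition(":")
    # Wildcard binds accept connections on every interface.
    if host in ("", "::", "0.0.0.0", "*"):
        return False
    addr = ipaddress.ip_address(host)
    # ::ffff:a.b.c.d is an IPv4-mapped IPv6 address; check the IPv4 part.
    mapped = getattr(addr, "ipv4_mapped", None)
    if mapped is not None:
        addr = mapped
    return addr.is_loopback


if __name__ == "__main__":
    for value in (":::60000", "::ffff:127.0.0.1:60020"):
        print(value, "loopback only:", is_loopback_bind(value))
```

A listener that is loopback-only cannot be reached from another machine, which is consistent with the master never seeing the region server.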

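I cannot telnet from the master to port 60020 on the region server (telnet nbd-hadoop-data1 60020): connection refused. This may be the root of the problem, but I don't know how to reconfigure it. I could not find anywhere why the region server opens the port on ::ffff:127.0.0.1:60020.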
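The telnet test can also be scripted. Below is a minimal Python sketch of the same reachability check; the function name can_connect and the 3-second timeout are assumptions chosen for illustration, and the host and port in the usage line are the ones from this question.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Try a plain TCP connection, roughly what 'telnet host port' does."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Run from the master; False here corresponds to "connection refused".
    print(can_connect("nbd-hadoop-data1", 60020))
```

Running it from the master against each region server gives a quick yes/no answer without reading telnet output.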

Thanks a lot for any hints. If you need additional logs or config files, I will provide them.

Best answer

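Problem solved. It was caused by the loopback entry in my /etc/hosts file that maps the machine's own hostname to the loopback address (127.0.0.1 hostname). With that line present, a server resolves its own hostname to 127.0.0.1, which matches the region server listening on ::ffff:127.0.0.1:60020.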
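A check for this kind of entry can be sketched in a few lines of Python. This is only an illustration: the function name loopback_hostnames is invented here, and treating every name that starts with "localhost" as harmless is a simplifying assumption.

```python
import ipaddress


def loopback_hostnames(hosts_text: str) -> list:
    """Return names mapped to a loopback address, ignoring localhost aliases.

    Hypothetical helper; pass it the contents of /etc/hosts as a string.
    """
    offenders = []
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # not a plain IP address, skip the line
        if not addr.is_loopback:
            continue
        for name in fields[1:]:
            # Assumption: names beginning with "localhost" are expected here.
            if not name.startswith("localhost"):
                offenders.append(name)
    return offenders


if __name__ == "__main__":
    with open("/etc/hosts") as handle:
        print(loopback_hostnames(handle.read()))
```

For the /etc/hosts shown in the question, this would report nbd-hadoop-nn1, the one hostname that appears on a 127.0.0.1 line.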

This question (hadoop - HBase region servers are not communicating with the master) comes from a similar question found on Stack Overflow: https://stackoverflow.com/questions/38180823/
