gpt4 book ai didi

Hadoop HA 设置 : not able to connect to zookeeper

转载 作者:可可西里 更新时间:2023-11-01 16:31:34 26 4
gpt4 key购买 nike

我正在按照以下文章尝试设置 Hadoop HA。

http://hashprompt.blogspot.in/2015/01/fully-distributed-hadoop-cluster.html

配置完成后,当我尝试运行时

hdfs zkfc -formatZK

我收到以下错误。

    15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Client     environment:java.library.path=/opt/hadoop-2.6.0/lib/native
15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Client environment:os.version=3.13.0-32-generic
15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Client environment:user.name=huser
15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/huser
15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Client environment:user.dir=/opt/hadoop-2.6.0/sbin
15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=mo-4594ddc63.mo.sap.corp:2181,mo-6dd5bf8b8.mo.sap.corp:2181,mo-e7b2822cb.mo.sap.corp:2181 sessionTimeout=5000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@4d9e68d0
15/03/30 12:18:14 INFO zookeeper.ClientCnxn: Opening socket connection to server mo-4594ddc63.mo.sap.corp/10.97.155.65:2181. Will not attempt to authenticate using SASL (unknown error)
15/03/30 12:18:14 INFO zookeeper.ClientCnxn: Socket connection established to mo-4594ddc63.mo.sap.corp/10.97.155.65:2181, initiating session
15/03/30 12:18:14 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
15/03/30 12:18:15 INFO zookeeper.ClientCnxn: Opening socket connection to server mo-e7b2822cb.mo.sap.corp/10.97.136.84:2181. Will not attempt to authenticate using SASL (unknown error)
15/03/30 12:18:15 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
15/03/30 12:18:15 INFO zookeeper.ClientCnxn: Opening socket connection to server mo-6dd5bf8b8.mo.sap.corp/10.97.156.12:2181. Will not attempt to authenticate using SASL (unknown error)
15/03/30 12:18:15 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
15/03/30 12:18:17 INFO zookeeper.ClientCnxn: Opening socket connection to server mo-4594ddc63.mo.sap.corp/10.97.155.65:2181. Will not attempt to authenticate using SASL (unknown error)
15/03/30 12:18:17 INFO zookeeper.ClientCnxn: Socket connection established to mo-4594ddc63.mo.sap.corp/10.97.155.65:2181, initiating session
15/03/30 12:18:17 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
15/03/30 12:18:17 INFO zookeeper.ClientCnxn: Opening socket connection to server mo-e7b2822cb.mo.sap.corp/10.97.136.84:2181. Will not attempt to authenticate using SASL (unknown error)
15/03/30 12:18:17 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
15/03/30 12:18:18 INFO zookeeper.ClientCnxn: Opening socket connection to server mo-6dd5bf8b8.mo.sap.corp/10.97.156.12:2181. Will not attempt to authenticate using SASL (unknown error)
15/03/30 12:18:18 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
15/03/30 12:18:19 ERROR ha.ActiveStandbyElector: Connection timed out: couldn't connect to ZooKeeper in 5000 milliseconds
15/03/30 12:18:19 INFO zookeeper.ClientCnxn: Opening socket connection to server mo-4594ddc63.mo.sap.corp/10.97.155.65:2181. Will not attempt to authenticate using SASL (unknown error)
15/03/30 12:18:19 INFO zookeeper.ClientCnxn: Socket connection established to mo-4594ddc63.mo.sap.corp/10.97.155.65:2181, initiating session
15/03/30 12:18:20 INFO zookeeper.ZooKeeper: Session: 0x0 closed
15/03/30 12:18:20 INFO zookeeper.ClientCnxn: EventThread shut down
15/03/30 12:18:20 FATAL ha.ZKFailoverController: Unable to start failover controller. Unable to connect to ZooKeeper quorum at mo-4594ddc63.mo.sap.corp:2181,mo-6dd5bf8b8.mo.sap.corp:2181,mo-e7b2822cb.mo.sap.corp:2181. Please check the configured value for ha.zookeeper.quorum and ensure that ZooKeeper is running.

安装 zookeeper 后(我遵循 http://rajsyrus.blogspot.sg/2014/04/configuring-hadoop-high-availability.html ),我在每个节点上启动了 zookeeper 服务

./zkServer.sh start

命令,但是当我使用

查看它的状态时
./zkServer.sh status

出现以下结果

JMX enabled by default
Using config: /home/huser/zookeeper-3.4.6/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.

这意味着它可能没有正常运行。

zoo.cfg 的内容

# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/home/huser/zookeeper/data/
dataLogDir=/home/huser/zookeeper/log/
server.1=mo-4594ddc63.mo.sap.corp:2888:3888
server.2=mo-6dd5bf8b8.mo.sap.corp:2888:3888
server.3=mo-e7b2822cb.mo.sap.corp:2888:3888
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

core-site.xml 的内容

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://auto-ha</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>mo-4594ddc63.mo.sap.corp:2181,mo-6dd5bf8b8.mo.sap.corp:2181,mo-e7b2822cb.mo.sap.corp.hadoop.lab:2181</value>
</property>
</configuration>

hdfs-site.xml 的内容

<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///hdfs/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///hdfs/data</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>auto-ha</value>
</property>
<property>
<name>dfs.ha.namenodes.auto-ha</name>
<value>nn01,nn02</value>
</property>
<property>
<name>dfs.namenode.rpc-address.auto-ha.nn01</name>
<value>mo-4594ddc63.mo.sap.corp:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.auto-ha.nn01</name>
<value>mo-4594ddc63.mo.sap.corp:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.auto-ha.nn02</name>
<value>mo-6dd5bf8b8.mo.sap.corp:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.auto-ha.nn02</name>
<value>mo-6dd5bf8b8.mo.sap.corp:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://mo-4594ddc63.mo.sap.corp:8485;mo-6dd5bf8b8.mo.sap.corp:8485;mo-e7b2822cb.mo.sap.corp:8485/auto-ha</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/hdfs/journalnode</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/huser/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled.auto-ha</name>
<value>true</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>mo-4594ddc63.mo.sap.corp:2181,mo-6dd5bf8b8.mo.sap.corp:2181,mo-e7b2822cb.mo.sap.corp:2181</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.auto-ha</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
</configuration>

任何指向错误解决的指针都会有很大帮助。

问候,萨汗卡尔

编辑

在完成 Rajesh 在他的回答中提到的操作之后,它似乎工作正常,因为没有错误。但是,设置后,运行 PI 示例显示以下错误。

huser@mo-4594ddc63:~$ hadoop jar /opt/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 8 10000
Number of Maps = 8
Samples per Map = 10000
15/03/31 13:23:08 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/huser/QuasiMonteCarlo_1427808186022_1353266286/in/part0 could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

at org.apache.hadoop.ipc.Client.call(Client.java:1468)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1532)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1349)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/huser/QuasiMonteCarlo_1427808186022_1353266286/in/part0 could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

at org.apache.hadoop.ipc.Client.call(Client.java:1468)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1532)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1349)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
15/03/31 13:23:08 ERROR hdfs.DFSClient: Failed to close inode 16390
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/huser/QuasiMonteCarlo_1427808186022_1353266286/in/part0 could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

at org.apache.hadoop.ipc.Client.call(Client.java:1468)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1532)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1349)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)

似乎数据节点没有运行!!关于可能是什么错误的任何指示!

EDIT2

重试几次后,我停止了一切并再次启动了所有节点。但现在似乎 namenode02 没有启动。当我运行命令 hdfs haadmin -getServiceState nn02 我收到此错误操作失败:从 mo-4594ddc63/10.97.155.65 调用 mo-6dd5bf8b8 连接异常失败:java.net.ConnectException:连接被拒绝;有关详细信息,请参阅:wiki.apache.org/hadoop/ConnectionRefused来自未连接的 NameNode02 的日志。

2015-03-30 12:58:04,837 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 8020, call org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol.rollEditLog from 10.97.155.65:60502 Call#229 Retry#0: org.apache.hadoop.ipc.StandbyException: Operation category JOURNAL is not supported in state standby
2015-03-30 12:58:52,094 INFO org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Triggering log roll on remote NameNode mo-4594ddc63.mo.sap.corp/10.97.155.65:8020
2015-03-30 12:58:52,103 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unable to trigger a roll of the active NN
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category JOURNAL is not supported in state standby
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1719)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1350)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:6336)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:933)
at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:139)
at org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService$2.callBlockingMethod(NamenodeProtocolProtos.java:11214)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

at org.apache.hadoop.ipc.Client.call(Client.java:1468)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy15.rollEditLog(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:145)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:271)

在Datanode中,我找到了这些日志

java.io.EOFException: End of File Exception between local host is: "mo-217e677f3.mo.sap.corp/10.97.168.28"; destination host is: "mo-4594ddc63.mo.sap.corp":8020; : java.io.EOFException; For more details see:  http://wiki.apache.org/hadoop/EOFException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy12.sendHeartbeat(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:139)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:582)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1071)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:966)

每个节点的/etc/hosts文件

10.97.156.12    localhost
10.97.156.12 mo-6dd5bf8b8.mo.sap.corp mo-6dd5bf8b8
10.97.155.65 mo-4594ddc63.mo.sap.corp
#10.97.156.12 mo-6dd5bf8b8.mo.sap.corp
10.97.136.84 mo-e7b2822cb.mo.sap.corp
10.97.168.28 mo-217e677f3.mo.sap.corp
10.97.157.82 mo-fd6fa7b57.mo.sap.corp

ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
::1 ip6-localhost ip6-loopback
fe00:: ip6-localnet
ff00:: ip6-mcastprefix

各节点操作系统:ubuntu 12.04

最佳答案

在 zoo.cfg 中改变这个:

server.1=mo-4594ddc63.mo.sap.corp:2888:3888
server.2=mo-6dd5bf8b8.mo.sap.corp:2888:3888
server.3=mo-e7b2822cb.mo.sap.corp:2888:3888

server.1=mo-4594ddc63.mo.sap.corp:2888:3888
server.2=mo-6dd5bf8b8.mo.sap.corp:2889:3889
server.3=mo-e7b2822cb.mo.sap.corp:2890:3890

现在启动zookeeper并检查状态。

关于Hadoop HA 设置 : not able to connect to zookeeper,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/29346348/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com