
unit-testing - Load balancing HiveServer2 with ZooKeeper in a Hadoop mini-cluster

Reposted · Author: 行者123 · Updated: 2023-12-02 20:37:57

We have a large number of Hive unit tests that run in a Hadoop mini-cluster. The problem is that they run sequentially, so each build takes about an hour to complete. We would like to parallelize the Hive unit tests by load balancing across multiple HiveServer2 instances with ZooKeeper.

When connecting directly to a HiveServer2 instance with the connection string "jdbc:hive2://localhost:20103/default", everything works as expected. However, when connecting through ZooKeeper with the connection string "jdbc:hive2://localhost:22010/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2", it fails with the error below.
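For reference, the two connection modes differ only in how the URL is built: the direct form names one HiveServer2 host and port, while the discovery form names the ZooKeeper quorum plus the `serviceDiscoveryMode` and `zooKeeperNamespace` parameters. A minimal helper sketch (the class and method names here are hypothetical; the hosts and ports are the ones from this question):

```java
// Hypothetical helper that builds the two Hive JDBC URL forms used in
// this question. The parameter keys (serviceDiscoveryMode,
// zooKeeperNamespace) are the ones the Hive JDBC driver recognizes.
final class HiveUrls {
    // Direct connection to a single HiveServer2 instance.
    static String direct(String host, int port, String db) {
        return String.format("jdbc:hive2://%s:%d/%s", host, port, db);
    }

    // Connection via ZooKeeper dynamic service discovery.
    static String viaZooKeeper(String zkQuorum, String db, String namespace) {
        return String.format(
            "jdbc:hive2://%s/%s;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=%s",
            zkQuorum, db, namespace);
    }

    public static void main(String[] args) {
        System.out.println(direct("localhost", 20103, "default"));
        System.out.println(viaZooKeeper("localhost:22010", "default", "hiveserver2"));
    }
}
```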

Is ZooKeeper in the Hadoop mini-cluster able to do load balancing?

INFO: Connecting to : jdbc:hive2://localhost:22010/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2

java.sql.SQLException: org.apache.hive.jdbc.ZooKeeperHiveClientException: Unable to read HiveServer2 configs from ZooKeeper

at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:135)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: org.apache.hive.jdbc.ZooKeeperHiveClientException: Unable to read HiveServer2 configs from ZooKeeper
at org.apache.hive.jdbc.ZooKeeperHiveClientHelper.configureConnParams(ZooKeeperHiveClientHelper.java:80)
at org.apache.hive.jdbc.Utils.configureConnParams(Utils.java:505)
at org.apache.hive.jdbc.Utils.parseURL(Utils.java:425)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:133)
... 29 more
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hiveserver2
at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1590)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:214)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:203)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl.pathInForeground(GetChildrenBuilderImpl.java:199)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:191)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:38)
at org.apache.hive.jdbc.ZooKeeperHiveClientHelper.configureConnParams(ZooKeeperHiveClientHelper.java:63)
... 32 more
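The root cause, `NoNodeException ... NoNode for /hiveserver2`, means the parent znode for the discovery namespace was never created in this ZooKeeper, i.e. no HiveServer2 instance registered itself there. When registration works, each server publishes an ephemeral child znode whose name encodes its endpoint, roughly `serverUri=<host:port>;version=<v>;sequence=<n>` (this layout matches Hive 1.2-era servers; treat the exact format as an assumption). A small standalone parser sketch for that name format:

```java
import java.util.Optional;

// Sketch: extract host:port from a HiveServer2 discovery znode name.
// The "serverUri=...;version=...;sequence=..." layout is what Hive 1.2-era
// servers publish under /<zooKeeperNamespace>; the exact format is an
// assumption here, not confirmed by this question.
final class Hs2Znode {
    static Optional<String> serverUri(String znodeName) {
        for (String part : znodeName.split(";")) {
            if (part.startsWith("serverUri=")) {
                return Optional.of(part.substring("serverUri=".length()));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Example znode name in the assumed format.
        String name = "serverUri=localhost:20103;version=1.2.1;sequence=0000000000";
        System.out.println(Hs2Znode.serverUri(name).orElse("not found"));
    }
}
```

If listing the children of `/hiveserver2` (e.g. with a ZooKeeper CLI or Curator client against 127.0.0.1:22010) shows no such entries, the failure is registration, not load balancing.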

Versions used
<hive.version>1.2.1000.2.4.0.0-169</hive.version>
<hadoop.version>2.7.1.2.4.0.0-169</hadoop.version>
<minicluster.version>0.1.14</minicluster.version>

Server configuration
public HiveServerRunner() {

zookeeperLocalCluster = new ZookeeperLocalCluster.Builder()
.setPort(22010)
.setTempDir("embedded_zk")
.setZookeeperConnectionString("127.0.0.1:22010")
.setDeleteDataDirectoryOnClose(true)
.build();

hiveLocalMetaStore = new HiveLocalMetaStore.Builder()
.setHiveMetastoreHostname("localhost")
.setHiveMetastorePort(20102)
.setHiveMetastoreDerbyDbDir("metastore_db")
.setHiveScratchDir("hive_scratch_dir")
.setHiveWarehouseDir("warehouse_dir")
.setHiveConf(buildHiveConf())
.build();

hiveLocalServer2 = new HiveLocalServer2.Builder()
.setHiveServer2Hostname("localhost")
.setHiveServer2Port(20103)
.setHiveMetastoreHostname("localhost")
.setHiveMetastorePort(20102)
.setHiveMetastoreDerbyDbDir("metastore_db")
.setHiveScratchDir("hive_scratch_dir")
.setHiveWarehouseDir("warehouse_dir")
.setHiveConf(buildHiveConf())
.setZookeeperConnectionString("127.0.0.1:22010")
.build();
}

public static HiveConf buildHiveConf() {
HiveConf hiveConf = new HiveConf();
hiveConf.set("hive.txn.manager", "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager");
hiveConf.set("hive.compactor.initiator.on", "true");
hiveConf.set("hive.compactor.worker.threads", "5");
hiveConf.set("hive.root.logger", "DEBUG,console");
hiveConf.set("hadoop.bin.path", System.getenv("HADOOP_HOME") + "/bin/hadoop");
hiveConf.set("hive.exec.submit.local.task.via.child", "false");
hiveConf.set("hive.server2.support.dynamic.service.discovery", "true");
hiveConf.set("hive.zookeeper.quorum", "127.0.0.1:22010");
hiveConf.setInt("hive.metastore.connect.retries", 3); // setIntVar expects a ConfVars enum; setInt takes a String key
System.setProperty("HADOOP_HOME", WindowsLibsUtils.getHadoopHome());
return hiveConf;
}

Best Answer

It looks like ZooKeeper does perform load balancing, but it directs each client request to a random available HS2 instance.

For more details, see the link below:

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_hadoop-high-availability/content/ha-hs2-service-discovery.html
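The random selection the answer describes can be sketched as follows. This is illustrative only: the real logic lives inside `org.apache.hive.jdbc.ZooKeeperHiveClientHelper`, and this standalone sketch (hypothetical class name) just shows the effect of picking one registered server URI at random.

```java
import java.util.List;
import java.util.Random;

// Illustrative sketch of the driver's behavior: given the server URIs
// registered under the discovery namespace, pick one at random.
final class RandomHs2Picker {
    static String pick(List<String> serverUris, Random rnd) {
        if (serverUris.isEmpty()) {
            // Mirrors the failure mode in the question: nothing registered.
            throw new IllegalStateException("No HiveServer2 instances registered");
        }
        return serverUris.get(rnd.nextInt(serverUris.size()));
    }

    public static void main(String[] args) {
        List<String> uris = List.of("localhost:20103", "localhost:20104");
        System.out.println(RandomHs2Picker.pick(uris, new Random()));
    }
}
```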

Regarding unit-testing - load balancing HiveServer2 with ZooKeeper in a Hadoop mini-cluster, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50537948/
