
java - Problem with Hive Streaming and Azure Data Lake Store

Reposted. Author: 可可西里. Updated: 2023-11-01 14:55:31

I am writing a Play2 Java web application that uses the Hive Streaming API ( https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest ) to ingest data into HDInsight Interactive Query. The Hive data is stored in Azure Data Lake Store.

My code is loosely based on https://github.com/mradamlacey/hive-streaming-azure-hdinsight/blob/master/src/main/java/com/cbre/eim/HiveStreamingExample.java .

When I run the code on one of the head nodes, I get the following error:

play.api.UnexpectedException: Unexpected exception[StreamingIOFailure: Failed creating RecordUpdaterS for adl://home/hive/warehouse/data/ingest_date=2018-05-07 txnIds[486,495]]
at play.api.http.HttpErrorHandlerExceptions$.throwableToUsefulException(HttpErrorHandler.scala:251)
at play.api.http.DefaultHttpErrorHandler.onServerError(HttpErrorHandler.scala:182)
at play.core.server.AkkaHttpServer$$anonfun$2.applyOrElse(AkkaHttpServer.scala:343)
at play.core.server.AkkaHttpServer$$anonfun$2.applyOrElse(AkkaHttpServer.scala:341)
at scala.concurrent.Future.$anonfun$recoverWith$1(Future.scala:414)
at scala.concurrent.impl.Promise.$anonfun$transformWith$1(Promise.scala:37)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:91)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
Caused by: org.apache.hive.hcatalog.streaming.StreamingIOFailure: Failed creating RecordUpdaterS for adl://home/hive/warehouse/data/ingest_date=2018-05-07 txnIds[486,495]
at org.apache.hive.hcatalog.streaming.AbstractRecordWriter.newBatch(AbstractRecordWriter.java:166)
at org.apache.hive.hcatalog.streaming.StrictJsonWriter.newBatch(StrictJsonWriter.java:41)
at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:559)
at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:512)
at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatchImpl(HiveEndPoint.java:397)
at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatch(HiveEndPoint.java:377)
at hive.HiveRepository.createMany(HiveRepository.java:76)
at controllers.HiveController.create(HiveController.java:40)
at router.Routes$$anonfun$routes$1.$anonfun$applyOrElse$2(Routes.scala:70)
at play.core.routing.HandlerInvokerFactory$$anon$4.resultCall(HandlerInvoker.scala:137)
Caused by: java.io.IOException: No FileSystem for scheme: adl
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2644)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2651)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.<init>(OrcRecordUpdater.java:233)
at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat.getRecordUpdater(OrcOutputFormat.java:292)
at org.apache.hive.hcatalog.streaming.AbstractRecordWriter.createRecordUpdater(AbstractRecordWriter.java:226)

I have also raised the issue on the Microsoft forum and in the Hive JIRA.

I can confirm that the JARs described here are present on the classpath:

com.microsoft.azure.azure-data-lake-store-sdk-2.2.5.jar
org.apache.hadoop.hadoop-azure-datalake-3.1.0.jar

Best answer

No FileSystem for scheme

You get this error when the filesystem is not configured. This probably needs to be done in the core-site.xml file of both HiveServer and your local client.
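As a rough illustration of what such a configuration might look like, here is a core-site.xml fragment that maps the adl scheme to the implementation classes shipped in hadoop-azure-datalake. The property names follow the Hadoop 3.x ADLS documentation as far as I know; older Hadoop releases may use different key names, and an HDInsight cluster may already define some of these on the server side. TENANT_ID, CLIENT_ID and CLIENT_SECRET are placeholders, not values from the question, and the service-principal (ClientCredential) login is only one assumed way of authenticating.

```xml
<!-- core-site.xml fragment (sketch; values in CAPS are placeholders) -->
<configuration>
  <!-- Map the adl:// scheme to the ADLS FileSystem implementation -->
  <property>
    <name>fs.adl.impl</name>
    <value>org.apache.hadoop.fs.adl.AdlFileSystem</value>
  </property>
  <property>
    <name>fs.AbstractFileSystem.adl.impl</name>
    <value>org.apache.hadoop.fs.adl.Adl</value>
  </property>
  <!-- Assumed authentication: Azure AD service principal -->
  <property>
    <name>fs.adl.oauth2.access.token.provider.type</name>
    <value>ClientCredential</value>
  </property>
  <property>
    <name>fs.adl.oauth2.refresh.url</name>
    <value>https://login.microsoftonline.com/TENANT_ID/oauth2/token</value>
  </property>
  <property>
    <name>fs.adl.oauth2.client.id</name>
    <value>CLIENT_ID</value>
  </property>
  <property>
    <name>fs.adl.oauth2.credential</name>
    <value>CLIENT_SECRET</value>
  </property>
</configuration>
```

The file has to be visible to the process that opens the adl:// path. In the stack trace above that is the streaming client itself (OrcRecordUpdater calls Path.getFileSystem), so the Play application's own classpath would need to see this configuration, not only HiveServer.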

The fact that the JARs exist does not mean that they are loaded onto the classpath and configured to read from your Azure account.
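One way to check whether a JAR is actually loaded, rather than merely present on disk, is to ask the running JVM to resolve a class from it. The following is a minimal diagnostic sketch using only the JDK. The class name AdlClasspathCheck and the helper locate are made up for illustration, and org.apache.hadoop.fs.adl.AdlFileSystem is assumed to be the FileSystem implementation class provided by hadoop-azure-datalake.

```java
import java.security.CodeSource;

class AdlClasspathCheck {
    // Assumption: the FileSystem implementation shipped in hadoop-azure-datalake.
    static final String ADL_FS_CLASS = "org.apache.hadoop.fs.adl.AdlFileSystem";

    /**
     * Returns a description of where the class was loaded from,
     * or null if the current class loader cannot load it.
     */
    static String locate(String className) {
        // Prefer the thread context class loader, which is what frameworks
        // such as Play typically set up for application code.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = AdlClasspathCheck.class.getClassLoader();
        }
        try {
            // initialize=false: only resolve the class, do not run static initializers
            Class<?> cls = Class.forName(className, false, loader);
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // JDK classes have no code source
            return source == null ? "(bootstrap/JDK)" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String where = locate(ADL_FS_CLASS);
        if (where == null) {
            System.out.println("NOT loadable: " + ADL_FS_CLASS);
        } else {
            System.out.println("Loaded from: " + where);
        }
    }
}
```

Calling locate from inside the web application (for example from the controller that triggers the ingest) is more telling than running main separately, because it uses the same class loader as the failing code. A non-null result only shows that the class can be loaded; the scheme mapping and credentials still have to be configured.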

Regarding "java - Problem with Hive Streaming and Azure Data Lake Store", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50304352/
