
configuration - Hive on Spark > YARN mode > Spark configuration > what value to assign to spark.master


I am trying to use my own custom SerDe with HiveQL (it works fine with plain Hive). I followed these instructions: https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started

However, this part confuses me: "Start Spark cluster (both standalone and Spark on YARN are supported)." As I understand it, we only need to start a Spark cluster if Spark runs in standalone mode. Since I intend to run Spark on YARN, do I still need to start a Spark cluster? What I did was simply start Hadoop YARN; because I really didn't know what value to give the spark.master property, I didn't set it at all. Probably because of that, I get the following error when running a Hive query that uses my own SerDe:

2015-10-05 20:42:07,184 INFO  [main]: status.SparkJobMonitor (RemoteSparkJobMonitor.java:startMonitor(67)) - Job hasn't been submitted after 61s. Aborting it.

2015-10-05 20:42:07,184 ERROR [main]: status.SparkJobMonitor (SessionState.java:printError(960)) - Status: SENT
2015-10-05 20:42:07,184 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=SparkRunJob start=1444066866174 end=1444066927184 duration=61010 from=org.apache.hadoop.hive.ql.exec.spark.status.SparkJobMonitor>
2015-10-05 20:42:07,300 ERROR [main]: ql.Driver (SessionState.java:printError(960)) - FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
2015-10-05 20:42:07,300 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=Driver.execute start=1444066848958 end=1444066927300 duration=78342 from=org.apache.hadoop.hive.ql.Driver>

...

and at the end, the following exception:

2015-10-05 20:42:16,658 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/10/05 20:42:16 INFO yarn.Client: Application report for application_1444066615793_0001 (state: ACCEPTED)
2015-10-05 20:42:17,337 WARN [main]: client.SparkClientImpl (SparkClientImpl.java:stop(154)) - Timed out shutting down remote driver, interrupting...
2015-10-05 20:42:17,337 WARN [Driver]: client.SparkClientImpl (SparkClientImpl.java:run(430)) - Waiting thread interrupted, killing child process.
2015-10-05 20:42:17,345 WARN [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(572)) - Error in redirector thread.
java.io.IOException: Stream closed
    at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:272)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:154)
    at java.io.BufferedReader.readLine(BufferedReader.java:317)
    at java.io.BufferedReader.readLine(BufferedReader.java:382)
    at org.apache.hive.spark.client.SparkClientImpl$Redirector.run(SparkClientImpl.java:568)
    at java.lang.Thread.run(Thread.java:745)
2015-10-05 20:42:17,371 INFO [Thread-15]: session.SparkSessionManagerImpl (SparkSessionManagerImpl.java:shutdown(146)) - Closing the session manager.

Best Answer

Please try: set spark.master=yarn-client;
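For context: in YARN mode there is no separate Spark standalone cluster to start; YARN itself allocates the Spark executors, so starting Hadoop YARN (as the asker did) is sufficient. The missing piece is telling Hive which master to submit to. Below is a minimal sketch of the session setup, assuming the Hive 1.x / Spark 1.x versions implied by the logs above; the resource values are illustrative placeholders, not taken from the question:

set hive.execution.engine=spark;
set spark.master=yarn-client;
-- illustrative resource settings; tune for your cluster
set spark.executor.memory=512m;
set spark.executor.instances=2;

The same properties can be put in hive-site.xml so that every session picks them up. Note that Spark 2.x deprecates the yarn-client master string; the equivalent there is spark.master=yarn together with spark.submit.deployMode=client.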

Regarding "configuration - Hive on Spark > YARN mode > Spark configuration > what value to assign to spark.master", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/32955128/
