
apache-spark - Why does pyspark fail with "Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder'"?

Reposted. Author: 行者123. Updated: 2023-12-04 03:08:41

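For the life of me, I can't figure out what is wrong with my PySpark installation. I have installed all the dependencies, including Hadoop, but PySpark can't find it. Am I diagnosing this correctly?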

See the full error message below; it ultimately fails in PySpark SQL:

pyspark.sql.utils.IllegalArgumentException: u"Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':"

nickeleres@Nicks-MBP:~$ pyspark
Python 2.7.10 (default, Feb 7 2017, 00:08:15)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.34)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/opt/spark-2.2.0/jars/hadoop-auth-2.7.3.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
17/10/24 21:21:58 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/10/24 21:21:59 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
17/10/24 21:21:59 WARN Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
17/10/24 21:21:59 WARN Utils: Service 'SparkUI' could not bind on port 4042. Attempting port 4043.
Traceback (most recent call last):
  File "/opt/spark/python/pyspark/shell.py", line 45, in <module>
    spark = SparkSession.builder\
  File "/opt/spark/python/pyspark/sql/session.py", line 179, in getOrCreate
    session._jsparkSession.sessionState().conf().setConfString(key, value)
  File "/opt/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
  File "/opt/spark/python/pyspark/sql/utils.py", line 79, in deco
    raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.IllegalArgumentException: u"Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':"
>>>

Best answer

tl;dr Close all other Spark processes and start over.

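The following WARN messages indicate that another process (or several processes) is already holding these ports.

I am sure those processes are Spark processes, e.g. pyspark sessions or Spark applications.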

17/10/24 21:21:59 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
17/10/24 21:21:59 WARN Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
17/10/24 21:21:59 WARN Utils: Service 'SparkUI' could not bind on port 4042. Attempting port 4043.

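That is why, once Spark/pyspark found a port that was free to use for the web UI (4043, the last port attempted in the log above), it tried to instantiate HiveSessionStateBuilder and failed.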
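As a rough way to confirm that something is still listening on the ports named in the WARN lines, the sketch below probes 4040 to 4043 on localhost. It is written for Python 3 (the session in the question runs Python 2.7), and the helper name busy_ports is made up for illustration. A busy port only shows that some listener exists; it does not prove the listener is Spark.

```python
import socket


def busy_ports(ports, host="127.0.0.1"):
    """Return the ports in `ports` that already have a listener on `host`.

    Hypothetical helper, not part of Spark or PySpark.
    """
    busy = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(0.5)
            # connect_ex returns 0 when a listener accepts the connection
            if sock.connect_ex((host, port)) == 0:
                busy.append(port)
    return busy


if __name__ == "__main__":
    # 4040-4043 are the ports that appear in the WARN messages
    print(busy_ports(range(4040, 4044)))
```

If the list is non-empty, shutting down the sessions behind those ports and starting pyspark again is the step the tl;dr recommends.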

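pyspark fails because you cannot have more than one Spark application up and running that uses the same local Hive metastore. By default that metastore is an embedded Derby database, which only one process can open at a time.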
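One way to look for a metastore that is still held is to check for Derby lock files. The sketch below assumes the default setup, where the local metastore is an embedded Derby database created as a metastore_db directory in the folder pyspark was launched from, and that Derby marks an open database with db.lck (and, on some platforms, dbex.lck). The function name derby_lock_files is hypothetical. A lock file can also be left behind by a process that crashed, so treat a hit as a hint rather than proof.

```python
import os


def derby_lock_files(workdir="."):
    """Return Derby lock files found under <workdir>/metastore_db, if any.

    Hypothetical helper; assumes the default embedded Derby metastore layout.
    """
    db_dir = os.path.join(workdir, "metastore_db")
    found = []
    for name in ("db.lck", "dbex.lck"):
        path = os.path.join(db_dir, name)
        if os.path.isfile(path):
            found.append(path)
    return found


if __name__ == "__main__":
    # Run from the directory where pyspark was started
    print(derby_lock_files("."))
```

An empty list means no lock file was found in that directory, which fits a metastore that no running application is using.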

A similar question about this error can be found on Stack Overflow: https://stackoverflow.com/questions/46924010/
