
python - Pyspark throws IllegalArgumentException: 'Unsupported class file major version 55' when trying to use udf

Reposted. Author: 行者123. Updated: 2023-12-01 07:45:58

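I am running into the following problem when using UDFs in PySpark.

My code works fine as long as I don't use any UDFs. Simple operations such as selecting columns, or using SQL functions such as concat, cause no problems. As soon as I run an action on a DataFrame that uses a UDF, the program crashes with the following exception: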

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/Users/szymonk/Desktop/Projects/SparkTest/venv/lib/python2.7/site-packages/pyspark/jars/spark-unsafe_2.11-2.4.3.jar) to method java.nio.Bits.unaligned()
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
19/06/05 09:24:37 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Traceback (most recent call last):
File "/Users/szymonk/Desktop/Projects/SparkTest/Application.py", line 59, in <module>
transformations.select(udf_example(col("gender")).alias("udf_example")).show()
File "/Users/szymonk/Desktop/Projects/SparkTest/venv/lib/python2.7/site-packages/pyspark/sql/dataframe.py", line 378, in show
print(self._jdf.showString(n, 20, vertical))
File "/Users/szymonk/Desktop/Projects/SparkTest/venv/lib/python2.7/site-packages/py4j/java_gateway.py", line 1257, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/Users/szymonk/Desktop/Projects/SparkTest/venv/lib/python2.7/site-packages/pyspark/sql/utils.py", line 79, in deco
raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.IllegalArgumentException: u'Unsupported class file major version 55'

I have tried changing JAVA_HOME as suggested in "Pyspark error - Unsupported class file major version 55", but that did not help.
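For context, the number in the exception is the class file format version of the JVM that is running, not something specific to Spark. A minimal sketch of the mapping, assuming the usual rule that the Java release equals the major version minus 44 (the helper name `java_release_for_major` is made up for illustration):

```python
def java_release_for_major(major):
    """Map a JVM class file major version to the Java release number.

    Assumes the standard offset of 44, e.g. 52 -> Java 8, 55 -> Java 11.
    Versions 45-48 correspond to the old 1.1-1.4 releases.
    """
    if major < 45:
        raise ValueError("class file major versions start at 45")
    return major - 44


# The version reported in the traceback above
print(java_release_for_major(55))
# The version produced by JDK 1.8
print(java_release_for_major(52))
```

Under that rule, major version 55 points at Java 11, while the log shows Spark 2.4.3, which is normally run on Java 8.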

There is nothing fancy in my code. I only define a simple UDF that should return the length of the value in the "gender" column:

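from pprint import pprint
from pyspark.sql import SparkSession, Column
from pyspark.sql.functions import col, lit, struct, array, udf, concat, trim, when
from pyspark.sql.types import IntegerType

# Create (or reuse) a session; the snippet as posted assumed `spark` already existed.
spark = SparkSession.builder.getOrCreate()

transformations = spark.read.csv("Resources/PersonalData.csv", header=True)

# Declare the return type explicitly; without it udf() defaults to StringType.
udf_example = udf(lambda x: len(x), IntegerType())
transformations.select(udf_example(col("gender")).alias("udf_example")).show()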

I'm not sure whether it matters, but I'm using PyCharm on a Mac.

Best answer

I found the solution: I had to switch PyCharm's boot JDK (press Shift twice -> jdk -> select JDK 1.8).
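One way to confirm which Java the Python process actually picks up is to parse the output of `java -version` before starting Spark. The following is only a sketch, not part of the original answer: the helper names `parse_java_major` and `detect_java_major` are hypothetical, it assumes `java` is on the PATH of the run configuration, and it uses `subprocess.run`, so it needs Python 3.5+ rather than the Python 2.7 interpreter shown in the traceback.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Extract the Java release number from `java -version` output.

    Returns None if no version string is found.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    # Java 8 and earlier report themselves as 1.x, e.g. "1.8.0_212".
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def detect_java_major(java_cmd="java"):
    """Run `java -version` and return the release number, or None on failure."""
    try:
        result = subprocess.run(
            [java_cmd, "-version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except OSError:
        # The java executable was not found or could not be started.
        return None
    # `java -version` normally writes to stderr; check both streams to be safe.
    return parse_java_major(result.stderr + result.stdout)
```

Calling `detect_java_major()` at the top of the script and comparing the result with 8 would show whether the run configuration is still using a newer JDK, which may explain why changing JAVA_HOME in a shell alone had no effect inside the IDE.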

Regarding "python - Pyspark throws IllegalArgumentException: 'Unsupported class file major version 55' when trying to use udf", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56456077/
