
pandas - PySpark on AWS EMR with Pandas and pyarrow fails with "'JavaPackage' object is not callable"

Reposted. Author: 行者123. Updated: 2023-12-04 09:34:55

I am trying to convert a Pandas DataFrame to a PySpark DataFrame and get the following pyarrow-related warning:

import pandas as pd
import numpy as np

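# Build a 1,000,000 x 10 Pandas DataFrame of random floats.
data = np.random.rand(1000000, 10)
pdf = pd.DataFrame(data, columns=list("abcdefghij"))
# `spark` is assumed to be the SparkSession provided by the PySpark shell or notebook.
df = spark.createDataFrame(pdf)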
/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/session.py:714: UserWarning: createDataFrame attempted Arrow optimization because 'spark.sql.execution.arrow.enabled' is set to true; however, failed by the reason below:
'JavaPackage' object is not callable
Attempting non-optimization as 'spark.sql.execution.arrow.fallback.enabled' is set to true.
I have tried several versions of pyarrow (0.10.0, 0.14.1, 0.15.1, and others), but the result is always the same. How can I debug this?
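Since the question is how to debug this, a reasonable first step is to record which library versions the driver's Python interpreter actually imports, because the pyarrow that was installed is not necessarily the one PySpark picks up. The sketch below is only an illustration: the helper name collect_versions and the default package list are assumptions, not part of the original post.

```python
import importlib
import sys


def collect_versions(names=("pyspark", "pyarrow", "pandas", "numpy")):
    """Return a dict mapping each package name to the version that imports.

    A value of None means the package could not be imported; "unknown"
    means it imported but exposes no __version__ attribute.
    """
    versions = {"python": sys.version.split()[0]}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        versions[name] = getattr(module, "__version__", "unknown")
    return versions
```

Running print(collect_versions()) in the same shell or notebook that produced the warning shows the versions in effect there, which can then be compared with what was installed on the cluster.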

Best answer

I ran into the same problem. Changing the cluster release to emr-5.30.1 and the pyarrow version to 0.14.1 resolved it.
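To confirm that a cluster really ends up with the pyarrow version named in this answer, the installed version string can be compared against the pin. This is a minimal sketch: the constant ANSWER_PYARROW_VERSION and the helpers parse_version and matches_pin are hypothetical names introduced here, and the only value taken from the answer is 0.14.1.

```python
# Version reported as working in the answer above (together with emr-5.30.1).
ANSWER_PYARROW_VERSION = "0.14.1"


def parse_version(text):
    """Turn '0.14.1' into (0, 14, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def matches_pin(installed, pinned=ANSWER_PYARROW_VERSION):
    """Return True if the installed version string equals the pinned one."""
    if installed is None:
        return False
    return parse_version(installed) == parse_version(pinned)
```

For example, after importing pyarrow on the driver, matches_pin(pyarrow.__version__) indicates whether the interpreter is using the version from the answer.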

The original question on Stack Overflow: https://stackoverflow.com/questions/62634104/
