
pandas - How to fix "ImportError: Pandas >= 0.19.2 must be installed; however, it was not found"?

Reposted. Author: 行者123. Updated: 2023-12-04 01:17:36

I am using Spark 2.3.1 and want to call toPandas() (followed by unique()).

When I run the following code in pyspark:

df.toPandas()['column_01'].unique()

I get the following exception:
>>> df.toPandas()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/xxx/spark/python/pyspark/sql/dataframe.py", line 2075, in toPandas
    require_minimum_pandas_version()
  File "/Users/xxx/spark/python/pyspark/sql/utils.py", line 129, in require_minimum_pandas_version
    "it was not found." % minimum_pandas_version)
ImportError: Pandas >= 0.19.2 must be installed; however, it was not found.

How can I fix this?

Best answer

You need to install pandas: pip install pandas
Also, you do not need to convert to a pandas DataFrame to get the unique values. You can do this directly on the Spark DataFrame:
df.select('column_01').distinct()
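Note that distinct() returns another Spark DataFrame and is evaluated lazily, so nothing is printed until an action such as show() or collect() is called. To get a plain Python list similar to what unique() returns, a small helper along these lines could be used. This is a sketch, not part of the original answer: `unique_values` is a hypothetical name, and it assumes the distinct values fit in driver memory.

```python
def unique_values(df, column):
    """Collect the distinct values of `column` to the driver as a Python list.

    `df` is expected to be a Spark DataFrame. The order of the returned
    values is not guaranteed by Spark.
    """
    rows = df.select(column).distinct().collect()
    # Each collected Row has a single field, so take the first element.
    return [row[0] for row in rows]
```

In the pyspark shell this would be called as unique_values(df, 'column_01'). To only inspect the values, df.select('column_01').distinct().show() avoids collecting them at all.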

Regarding "pandas - How to fix "ImportError: Pandas >= 0.19.2 must be installed; however, it was not found"?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53807854/
