
sum - AttributeError: 'DataFrame' object has no attribute 'sum' when calling sum() on a column

Reposted. Author: 行者123. Updated: 2023-12-03 09:09:17

I have a DataFrame like this:

+-----+--------+
|count| country|
+-----+--------+
|   12| Ireland|
|    5|Thailand|
+-----+--------+

When I add the sum() function to get the total of the first column, "count", I get this error:

 AttributeError: 'DataFrame' object has no attribute 'sum'

I did import the function: from pyspark.sql.functions import sum

How can I compute the sum, or what am I missing?

Thank you, and thanks for your help.

Best answer

>>> from pyspark.sql.functions import sum
>>> a = [(12,"Ireland"),(5,"Thailand")]
>>> df = spark.createDataFrame(a,["count","country"])
>>> df.show()
+-----+--------+
|count| country|
+-----+--------+
|   12| Ireland|
|    5|Thailand|
+-----+--------+

As you can see here:

groupBy(): Groups the DataFrame using the specified columns, so we can run aggregation on them. See GroupedData for all the available aggregate functions.

In GroupedData you can find a set of aggregation methods for a DataFrame, such as sum(), avg(), and mean().

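So you have to group the data before applying these functions. Calling groupBy() with no columns treats the whole DataFrame as a single group.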

>>> total = df.groupBy().sum()
>>> total.show()
+----------+
|sum(count)|
+----------+
|        17|
+----------+

See here for an example of sum().

Regarding "sum - AttributeError: 'DataFrame' object has no attribute 'sum' when calling sum() on a column", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/44248742/
