dataframe - How to make the aggregate function "ignore" columns?

Reposted. Author: 行者123. Updated: 2023-12-01 13:38:33

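Suppose I have a data frame with several categorical dimensions and one "value" dimension, and I want to aggregate by some of those dimensions while ignoring the others.

Julia DataFrames has the aggregate function, but if I leave some categorical columns out of the grouping keys I get an error, because it tries to apply the function (here, sum) to those columns as well instead of simply ignoring them:

In: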

using DataArrays, DataFrames
df = DataFrame(
    colour = ["green", "blue", "white", "green", "green"],
    shape  = ["circle", "triangle", "square", "square", "circle"],
    border = ["dotted", "line", "line", "line", "dotted"],
    area   = [1.1, 2.3, 3.1, 4.2, 5.2])

Out:

    colour  shape     border  area
1   green   circle    dotted  1.1
2   blue    triangle  line    2.3
3   white   square    line    3.1
4   green   square    line    4.2
5   green   circle    dotted  5.2

In:

aggregate(df,[:colour,:shape, :border],sum) # Ok
aggregate(df,[:colour,:shape],sum) # what I would like, ignoring border column

Out:

LoadError: MethodError: no method matching +(::String, ::String)

Obviously I could just drop the extra columns before aggregating, but is there a way to do it in a single step?

Best answer

From https://juliastats.github.io/DataFrames.jl/split_apply_combine/:

by(df, [:colour,:shape]) do df
    DataFrame(m = sum(df[:area]))
end
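The by and aggregate functions used above belong to older DataFrames.jl releases and are no longer available in current ones (1.x). A minimal sketch of what is probably the closest equivalent today uses groupby together with combine, naming only the column to be summed so that border is never passed to sum. The output column name area_sum is an arbitrary choice for illustration, not something from the original answer.

```julia
using DataFrames

df = DataFrame(
    colour = ["green", "blue", "white", "green", "green"],
    shape  = ["circle", "triangle", "square", "square", "circle"],
    border = ["dotted", "line", "line", "line", "dotted"],
    area   = [1.1, 2.3, 3.1, 4.2, 5.2])

# Group by the two dimensions of interest and sum only :area.
# :border is not a grouping key and is not named in the transformation,
# so sum is never applied to it and no String addition is attempted.
# :area_sum is an illustrative name for the new column.
result = combine(groupby(df, [:colour, :shape]), :area => sum => :area_sum)
```

The result has one row per distinct (colour, shape) pair and contains only the grouping columns plus area_sum; the border column is dropped rather than aggregated.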

Regarding dataframe - How to make the aggregate function "ignore" columns?, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/42222217/
