python - How to access column values after a complex Pandas aggregation

Reposted. Author: 行者123. Updated: 2023-11-28 17:16:12
I performed a non-trivial aggregation, as follows:

aggregations = {
    'x_TmId': {
        'Trays': 'nunique',
        'Orderlines': 'count',
    },
    'x_Qty': 'sum'
}

newdf = pick.groupby(['Date','x_OrderId']).agg(aggregations).reset_index(True)

At this point the columns of the aggregated DataFrame can be listed as usual:

  newdf.columns

But this returns something I had not come across before, a MultiIndex object:

MultiIndex(levels=[['x_TmId', 'x_Qty', 'x_OrderId'], ['Orderlines', 'Trays', 'sum', '']],
labels=[[2, 0, 0, 1], [3, 0, 1, 2]])

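At this point I realised that I don't know how to refer to one of the new variables, for example "sum". There must be a similar question on Stack Overflow, but I have not found it.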

Best answer

I think the simplest way is to select from the MultiIndex in the columns with a tuple:

a = newdf[('x_Qty', 'sum')]

Another solution is to use slicers:

idx = pd.IndexSlice
print (newdf.loc[:, idx['x_Qty', 'sum']])
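
As a minimal sketch of both lookups, the block below builds a small stand-in for `pick` (the sample rows are hypothetical and not taken from the question) and shows that each column label of the aggregated frame is a `(column, function)` tuple, so the tuple and the `IndexSlice` form return the same Series.

```python
import pandas as pd

# Hypothetical stand-in for the question's `pick` frame.
pick = pd.DataFrame({
    "Date": ["2017-10-01", "2017-10-01", "2017-10-02"],
    "x_OrderId": [9, 9, 4],
    "x_TmId": [1, 2, 1],
    "x_Qty": [4, 1, 3],
})

newdf = pick.groupby(["Date", "x_OrderId"]).agg(
    {"x_TmId": ["nunique", "count"], "x_Qty": "sum"}
)

# Every column label is a tuple: (source column, aggregation name).
labels = newdf.columns.tolist()

# Select one column with its full tuple label; the result is a Series.
by_tuple = newdf[("x_Qty", "sum")]

# The same selection written with a slicer.
idx = pd.IndexSlice
by_slice = newdf.loc[:, idx["x_Qty", "sum"]]

# Selecting only the top level returns a DataFrame with the
# second-level labels ("nunique", "count") as its columns.
tm_part = newdf["x_TmId"]

print(labels)
print(by_tuple)
```

Printing `newdf.columns.tolist()` is usually the quickest way to find out which tuple to pass.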

But with pandas 0.20.1 the original dict-of-dicts aggregation produces a warning:

FutureWarning: using a dict with renaming is deprecated and will be removed in a future version
  return super(DataFrameGroupBy, self).aggregate(arg, *args, **kwargs)

The solution is to pass a list of function names and rename the resulting columns afterwards:

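aggregations = {
    'x_TmId': ['nunique', 'count'],
    'x_Qty': 'sum'
}

# move only x_OrderId out of the index (the original passed True, i.e. level 1)
newdf = pick.groupby(['Date','x_OrderId']).agg(aggregations).reset_index(level='x_OrderId')
# map the function names in the second column level to the wanted labels
d = {'nunique':'Trays','count':'Orderlines'}
newdf = newdf.rename(columns=d)
print (newdf)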
           x_OrderId x_TmId            x_Qty
                      Trays Orderlines   sum
Date
2017-10-01         9      1          1     4
2017-10-02         4      1          1     1
2017-10-03         0      1          1     3
2017-10-04         1      1          1     6
2017-10-05         9      1          1     5
2017-10-06         0      1          1     3
2017-10-07         1      1          1     9
2017-10-08         8      1          1     6
2017-10-09         9      1          1     9
2017-10-10         0      1          1     1
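
With MultiIndex columns, `rename(columns=d)` matches the keys of `d` against labels on every level. A hedged variant, assuming a pandas version where `DataFrame.rename` accepts the `level` argument, restricts the renaming to the second level so that a top-level column that happened to be called `count` would be left alone. The sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the question's `pick` frame.
pick = pd.DataFrame({
    "Date": ["2017-10-01", "2017-10-01", "2017-10-02"],
    "x_OrderId": [9, 9, 4],
    "x_TmId": [1, 2, 1],
    "x_Qty": [4, 1, 3],
})

agg = pick.groupby(["Date", "x_OrderId"]).agg(
    {"x_TmId": ["nunique", "count"], "x_Qty": "sum"}
)

# Rename only labels found on the second column level (level=1).
d = {"nunique": "Trays", "count": "Orderlines"}
renamed = agg.rename(columns=d, level=1)

renamed_cols = renamed.columns.tolist()
print(renamed_cols)
```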

But a simpler option is to remove the MultiIndex in the columns by joining the two levels into flat names:

aggregations = {
    'x_TmId': ['nunique', 'count'],
    'x_Qty': 'sum'
}

newdf = pick.groupby(['Date','x_OrderId']).agg(aggregations)
newdf.columns = newdf.columns.map('_'.join)
d = {'x_TmId_nunique':'Trays','x_TmId_count':'Orderlines'}
newdf = newdf.reset_index().rename(columns=d)
print (newdf)
         Date  x_OrderId  Trays  Orderlines  x_Qty_sum
0  2017-10-01          9      1           1          4
1  2017-10-02          4      1           1          1
2  2017-10-03          0      1           1          3
3  2017-10-04          1      1           1          6
4  2017-10-05          9      1           1          5
5  2017-10-06          0      1           1          3
6  2017-10-07          1      1           1          9
7  2017-10-08          8      1           1          6
8  2017-10-09          9      1           1          9
9  2017-10-10          0      1           1          1

print (newdf['x_Qty_sum'])
0    4
1    1
2    3
3    6
4    5
5    3
6    9
7    6
8    9
9    1
Name: x_Qty_sum, dtype: int32
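
For readers on pandas 0.25 or later, named aggregation is another way to get flat, already renamed columns in one step. This is a sketch under that version assumption, not part of the original answer; the sample rows and the output name `x_Qty_sum` (chosen to match the flattened name above) are illustrative.

```python
import pandas as pd

# Hypothetical stand-in for the question's `pick` frame.
pick = pd.DataFrame({
    "Date": ["2017-10-01", "2017-10-01", "2017-10-02"],
    "x_OrderId": [9, 9, 4],
    "x_TmId": [1, 2, 1],
    "x_Qty": [4, 1, 3],
})

# Each keyword is the output column name; the value is
# a (source column, aggregation function) pair.
flat = (
    pick.groupby(["Date", "x_OrderId"])
        .agg(
            Trays=("x_TmId", "nunique"),
            Orderlines=("x_TmId", "count"),
            x_Qty_sum=("x_Qty", "sum"),
        )
        .reset_index()
)

print(flat)
print(flat["x_Qty_sum"])
```

Because the result never has MultiIndex columns, no tuple lookup or flattening step is needed afterwards.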

Regarding "python - How to access column values after a complex Pandas aggregation", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/43954561/