
python - How to group by and compute a value from other columns in Pandas

Reposted. Author: 行者123. Updated: 2023-12-04 01:05:00

I have a DataFrame that summarizes counts by Col1, Col2 and Col3, with a different weight applied to each count.

The dataset looks like this:


# Current result
    Col1  Col2  Col3  Count  Weightage_count
---------------------------------------------
1:  A     S1    X110  2      2
2:  A     S1    X150  2      0.5
3:  A     S2    X212  2      1
4:  A     S2    X200  1      0.5
5:  A     S2    X211  1      0.25
6:  B     S3    X311  4      4
7:  C     S4    X222  3      1.5


import pandas as pd

data = {'Col1': ['A', 'A', 'A', 'A', 'A', 'B', 'C'],
        'Col2': ['S1', 'S1', 'S2', 'S2', 'S2', 'S3', 'S4'],
        'Col3': ['X110', 'X150', 'X212', 'X200', 'X211', 'X311', 'X222'],
        'Count': [2, 2, 2, 1, 1, 4, 3],
        'Weightage_count': [2, 0.5, 1, 0.5, 0.25, 4, 1.5]}

df = pd.DataFrame(data)

I want to compute a result for each combination of Col1 and Col2.

  • Result = (sum of Weightage_count per Col1 and Col2) / (sum of Count per Col1 and Col2)
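
To make the formula concrete, here is a minimal sketch that works it out by hand for a single group, (A, S1), using the data from the question. The variable names `group`, `total_weight`, `total_count` and `result_a_s1` are illustrative only and do not appear in the original post.

```python
import pandas as pd

data = {'Col1': ['A', 'A', 'A', 'A', 'A', 'B', 'C'],
        'Col2': ['S1', 'S1', 'S2', 'S2', 'S2', 'S3', 'S4'],
        'Col3': ['X110', 'X150', 'X212', 'X200', 'X211', 'X311', 'X222'],
        'Count': [2, 2, 2, 1, 1, 4, 3],
        'Weightage_count': [2, 0.5, 1, 0.5, 0.25, 4, 1.5]}
df = pd.DataFrame(data)

# Pick out the rows of one group with a boolean mask (illustrative names).
group = df[(df['Col1'] == 'A') & (df['Col2'] == 'S1')]

total_weight = group['Weightage_count'].sum()  # 2 + 0.5 = 2.5
total_count = group['Count'].sum()             # 2 + 2 = 4
result_a_s1 = total_weight / total_count       # 2.5 / 4 = 0.625
print(result_a_s1)
```

Note that this is a ratio of sums, not a mean of per-row ratios; the two generally differ when the counts within a group are unequal.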

Expected result:

    Col1  Col2  Result
-----------------------
1   A     S1    0.625
2   A     S2    0.4375
3   B     S3    1
4   C     S4    0.5

Best answer

First aggregate with sum, then divide one summed column by the other with DataFrame.eval:

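# Starting from the original df: sum only the numeric columns per group,
# then divide them. eval returns a Series, so reset_index can name it.
df = (df.groupby(['Col1', 'Col2'])[['Count', 'Weightage_count']]
        .sum()
        .eval('Weightage_count / Count')
        .reset_index(name='Result'))
print(df)
  Col1 Col2  Result
0    A   S1  0.6250
1    A   S2  0.4375
2    B   S3  1.0000
3    C   S4  0.5000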

Or divide with Series.div, using DataFrame.pop so that each column is removed as it is used:

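# Starting from the original df: as_index=False keeps Col1 and Col2 as columns.
df = df.groupby(['Col1', 'Col2'], as_index=False)[['Count', 'Weightage_count']].sum()
# pop returns the column and drops it from df in the same step.
df['new'] = df.pop('Weightage_count').div(df.pop('Count'))
print(df)
  Col1 Col2     new
0    A   S1  0.6250
1    A   S2  0.4375
2    B   S3  1.0000
3    C   S4  0.5000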

If you also need to keep the summed columns:

# Starting from the original df: Col1 and Col2 become the index here.
df = df.groupby(['Col1', 'Col2'])[['Count', 'Weightage_count']].sum()
df['new'] = df['Weightage_count'].div(df['Count'])
print(df)
           Count  Weightage_count     new
Col1 Col2
A    S1        4             2.50  0.6250
     S2        4             1.75  0.4375
B    S3        4             4.00  1.0000
C    S4        3             1.50  0.5000
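
Each snippet above reassigns df, so they are alternatives to run against the original frame rather than one after another. As a rough sketch of one way to avoid that, the same calculation can be wrapped in a small function that leaves its input untouched. The function name `weightage_ratio`, its `keys` parameter and the variable `original` are hypothetical and not part of the answer.

```python
import pandas as pd


def weightage_ratio(frame, keys=('Col1', 'Col2')):
    """Hypothetical helper: sum(Weightage_count) / sum(Count) per group."""
    # Sum only the two numeric columns for each group.
    sums = frame.groupby(list(keys))[['Count', 'Weightage_count']].sum()
    # Dividing two Series yields a Series; reset_index names its values.
    return (sums['Weightage_count'] / sums['Count']).reset_index(name='Result')


original = pd.DataFrame({
    'Col1': ['A', 'A', 'A', 'A', 'A', 'B', 'C'],
    'Col2': ['S1', 'S1', 'S2', 'S2', 'S2', 'S3', 'S4'],
    'Col3': ['X110', 'X150', 'X212', 'X200', 'X211', 'X311', 'X222'],
    'Count': [2, 2, 2, 1, 1, 4, 3],
    'Weightage_count': [2, 0.5, 1, 0.5, 0.25, 4, 1.5],
})

result = weightage_ratio(original)
print(result)
```

Because the function returns a new frame, `original` still has all seven rows and five columns afterwards and can be passed to any of the other variants.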

This post is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/66794452/
