
python - Summing values for rows that share the same value

Reposted. Author: 行者123. Updated: 2023-11-28 22:29:53

I have a pandas dataset that looks like this:

city    difference 
NY 6
SF 8
LA 8
NY 9
SF 10

I want to sum the values of the difference column by the city column, so that my final dataset looks like:

city    difference    total difference
NY 6 15
NY 9
LA 8 8
SF 10 10

I tried:

df['total difference'] = df.groupby('city')['difference'].sum()

But it did not work. I even tried "How to sum values of particular rows in pandas?" but got NaN values in the new column. Please help!

Best answer

I think you need transform:

df['total difference'] = df.groupby('city')['difference'].transform(sum) 
print (df)
city difference total difference
0 NY 6 15
1 SF 8 18
2 LA 8 8
3 NY 9 15
4 SF 10 18
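
The NaN values in the question most likely come from index alignment: `groupby(...).sum()` returns one value per city, indexed by city name, while the DataFrame is indexed 0 to 4, so no labels match on assignment. A minimal sketch using the question's data (the names `per_city`, `misaligned`, and `via_map` are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["NY", "SF", "LA", "NY", "SF"],
    "difference": [6, 8, 8, 9, 10],
})

# One total per city; the result is indexed by city name (LA, NY, SF).
per_city = df.groupby("city")["difference"].sum()

# Column assignment aligns on index labels. The frame's index is 0..4,
# so none of the city labels match and every value becomes NaN.
misaligned = df.copy()
misaligned["total difference"] = per_city

# transform('sum') returns a result indexed like df itself,
# so each row receives the total of its own group.
df["total difference"] = df.groupby("city")["difference"].transform("sum")

# An equivalent lookup: map each row's city to its per-city total.
via_map = df["city"].map(per_city)

print(misaligned)
print(df)
```

Both `transform` and `map` keep the original row order, which is why the result can be assigned straight back as a new column.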

If you also need to sort by the city column:

df['total difference'] = df.groupby('city')['difference'].transform('sum') 
df = df.sort_values('city')
print (df)
city difference total difference
2 LA 8 8
0 NY 6 15
3 NY 9 15
1 SF 8 18
4 SF 10 18

I was curious whether passing 'sum', sum, or np.sum makes a difference; the timings are very similar:

import numpy as np
import pandas as pd

# [10000000 rows x 2 columns]
np.random.seed(100)
df = pd.DataFrame(np.random.randint(1000, size=(10000000, 2)), columns=['city', 'difference'])
# print(df)

In [293]: %timeit (df.groupby('city')['difference'].transform('sum'))
1 loop, best of 3: 570 ms per loop

In [294]: %timeit (df.groupby('city')['difference'].transform(sum))
1 loop, best of 3: 567 ms per loop

In [295]: %timeit (df.groupby('city')['difference'].transform(np.sum))
1 loop, best of 3: 561 ms per loop
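
For readers without IPython, the comparison can be approximated with the standard `timeit` module. This is only a sketch: the row count is reduced to 100,000 (an assumption, so that it runs quickly), and the absolute numbers will differ from those above. Newer pandas releases may also warn about, or treat differently, a builtin `sum` or `np.sum` passed as a callable, so the string `'sum'` is the safer spelling.

```python
import timeit

import numpy as np
import pandas as pd

np.random.seed(100)
n_rows = 100_000  # assumption: far fewer rows than the 10,000,000 used above
df = pd.DataFrame(
    np.random.randint(1000, size=(n_rows, 2)),
    columns=["city", "difference"],
)

grouped = df.groupby("city")["difference"]

# The three spellings compared in the answer.
variants = {"'sum'": "sum", "sum": sum, "np.sum": np.sum}

results = {}
for label, func in variants.items():
    # Keep one result per variant so the outputs can be compared.
    results[label] = grouped.transform(func)
    # Average wall-clock time over three calls.
    seconds = timeit.timeit(lambda: grouped.transform(func), number=3) / 3
    print(f"transform({label}): {seconds * 1000:.1f} ms per call")
```

All three variants should produce the same per-row totals; only the running time may vary with the pandas version.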

This question, "python - Summing values for rows that share the same value", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/42766654/
