
python - Pandas cumulative sum depending on another column's value

Reposted. Author: 行者123. Updated: 2023-12-04 07:12:48

I have a dataset like this:

Date        Runner  Group  distance [km]
2021-01-01  Joe     1      7
2021-01-02  Jack    1      6
2021-01-03  Jess    1      9
2021-01-01  Paul    2      11
2021-01-02  Peter   2      12
2021-01-02  Sara    3      15
2021-01-03  Sarah   3      10

I want to compute the cumulative sum of the distance within each group of runners:

Date        Runner  Group  distance [km]  cum sum [km]
2021-01-01  Joe     1      7              7
2021-01-02  Jack    1      6              13
2021-01-03  Jess    1      9              22
2021-01-01  Paul    2      11             11
2021-01-02  Peter   2      12             23
2021-01-02  Sara    3      15             15
2021-01-03  Sarah   3      10             25

Unfortunately, I don't know how to do this and haven't found an answer elsewhere. Could someone give me a hint?

import pandas as pd
import numpy as np

df = pd.DataFrame([['2021-01-01', 'Joe', 1, 7],
                   ['2021-01-02', 'Jack', 1, 6],
                   ['2021-01-03', 'Jess', 1, 9],
                   ['2021-01-01', 'Paul', 2, 11],
                   ['2021-01-02', 'Peter', 2, 12],
                   ['2021-01-02', 'Sara', 3, 15],
                   ['2021-01-03', 'Sarah', 3, 10]],
                  columns=['Date', 'Runner', 'Group', 'distance [km]'])

Best answer

Try groupby with cumsum:

>>> df['cum sum [km]'] = df.groupby('Group')['distance [km]'].cumsum()
>>> df
         Date  Runner  Group  distance [km]  cum sum [km]
0  2021-01-01     Joe      1              7             7
1  2021-01-02    Jack      1              6            13
2  2021-01-03    Jess      1              9            22
3  2021-01-01    Paul      2             11            11
4  2021-01-02   Peter      2             12            23
5  2021-01-02    Sara      3             15            15
6  2021-01-03   Sarah      3             10            25
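
One detail worth keeping in mind: cumsum accumulates in the order the rows appear within each group. The data in the question is already ordered by Group and Date, so no sorting is needed there. The sketch below assumes a shuffled subset of the same rows (the shuffling is an illustration, not part of the original question) and sorts before accumulating so that the running total follows the dates.

```python
import pandas as pd

# Hypothetical input: a subset of the question's rows, deliberately out of order.
df = pd.DataFrame(
    [
        ["2021-01-03", "Jess", 1, 9],
        ["2021-01-01", "Joe", 1, 7],
        ["2021-01-02", "Peter", 2, 12],
        ["2021-01-02", "Jack", 1, 6],
        ["2021-01-01", "Paul", 2, 11],
    ],
    columns=["Date", "Runner", "Group", "distance [km]"],
)

# cumsum follows row order inside each group, so sort by Group and Date first.
# ISO-formatted date strings sort chronologically; other formats may need
# pd.to_datetime before sorting.
df = df.sort_values(["Group", "Date"]).reset_index(drop=True)

# Running total of the distance, restarted for every Group.
df["cum sum [km]"] = df.groupby("Group")["distance [km]"].cumsum()

result = df["cum sum [km]"].tolist()
print(df)
```

Sorting changes the row order of the frame. If the original order has to be preserved, one option is to skip reset_index and call sort_index() afterwards.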

Regarding "python - Pandas cumulative sum depending on another column's value", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/68969322/
