
python - How to compute a cumulative sum that resets after reaching a threshold, per group, in a pandas DataFrame?

Reposted. Author: 行者123. Updated: 2023-12-03 08:01:14

I have a data frame like this:

import pandas as pd
import numpy as np

data={'trip':[1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,3,3,3,3],
'timestamps':[1235471761, 1235471763, 1235471765, 1235471767, 1235471770, 1235471772, 1235471776, 1235471779, 1235471780, 1235471789,1235471792,1235471793,1235471829,1235471833,1235471835,1235471838,1235471844,1235471847,1235471848,1235471852,1235471855,1235471859,1235471900,1235471904,1235471911,1235471913]}

df = pd.DataFrame(data)
df['TimeDistance'] = df.groupby('trip')['timestamps'].diff(1)
df

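What I am looking for is to start from the first row of the 'TimeDistance' column (treating it as the origin) and take a cumulative sum of its values. Whenever this sum reaches 10, the cumsum should restart, and the process should continue until the end of the trip (as you can see in this data frame, there are 3 trips in the 'trip' column). I want all the cumulative sums in a new column, say 'cumu'. Another important point is that after the threshold is reached, the next row in the 'cumu' column must be zero, and the summation starts again from this new origin.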

I have added a picture of my desired output.
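
To make the rule concrete, here is a minimal plain-Python sketch of one reading of it, applied to the TimeDistance values of trip 3. The helper name `cumsum_with_reset` is hypothetical, and "reaches 10" is assumed to mean "is greater than or equal to 10".

```python
def cumsum_with_reset(values, threshold=10):
    """Running total that restarts at 0 on the row after the threshold is reached."""
    out = []
    total = 0
    for i, v in enumerate(values):
        if i == 0 or total >= threshold:
            # Origin row of the trip, or the first row after reaching the threshold.
            total = 0
        else:
            total += v
        out.append(total)
    return out


# TimeDistance values of trip 3 (the first row of each trip is NaN).
trip3 = [float("nan"), 4.0, 7.0, 2.0]
print(cumsum_with_reset(trip3))  # [0, 4.0, 11.0, 0]
```

Note that the value on the reset row (2.0 here) is not added to anything; that row simply becomes the new origin.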

Best answer

I hope I have understood your question correctly. You can use a generator together with .send():

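def my_accumulate(maxval):
    val = 0
    yield  # priming point: next() advances the generator to here
    while True:
        if val < maxval:
            # Emit the running total, then add the value passed in by send().
            val += yield val
        else:
            # Threshold reached: emit the total, then restart from zero.
            yield val
            val = 0


def fn(x):
    a = my_accumulate(10)
    next(a)  # prime the generator so that it can accept send()
    x["cumu"] = [a.send(v) for v in x["TimeDistance"]]
    return x


# group_keys=False keeps the original flat row index, as in the output below.
df = df.groupby("trip", group_keys=False).apply(fn)
print(df)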
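
To see how the .send() protocol drives the generator, the following sketch runs the same generator on a plain list, outside pandas. The input list is the first seven TimeDistance values of trip 1; everything else is copied from the answer above.

```python
def my_accumulate(maxval):
    val = 0
    yield  # next() stops here; the first value sent in is discarded
    while True:
        if val < maxval:
            val += yield val  # emit the total, then add the next value sent in
        else:
            yield val  # emit the total that reached the threshold
            val = 0    # the value sent in at this step is dropped


a = my_accumulate(10)
next(a)  # without this priming call, send() with a non-None value raises TypeError

values = [float("nan"), 2.0, 2.0, 2.0, 3.0, 2.0, 4.0]
result = [a.send(v) for v in values]
print(result)  # [0, 2.0, 4.0, 6.0, 9.0, 11.0, 0]
```

The leading NaN of each trip never enters the sum because it is received by the bare `yield`, which is why the first 'cumu' value is 0 rather than NaN.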

Prints:

    trip  timestamps  TimeDistance  cumu
0      1  1235471761           NaN   0.0
1      1  1235471763           2.0   2.0
2      1  1235471765           2.0   4.0
3      1  1235471767           2.0   6.0
4      1  1235471770           3.0   9.0
5      1  1235471772           2.0  11.0
6      1  1235471776           4.0   0.0
7      1  1235471779           3.0   3.0
8      1  1235471780           1.0   4.0
9      1  1235471789           9.0  13.0
10     1  1235471792           3.0   0.0
11     1  1235471793           1.0   1.0
12     2  1235471829           NaN   0.0
13     2  1235471833           4.0   4.0
14     2  1235471835           2.0   6.0
15     2  1235471838           3.0   9.0
16     2  1235471844           6.0  15.0
17     2  1235471847           3.0   0.0
18     2  1235471848           1.0   1.0
19     2  1235471852           4.0   5.0
20     2  1235471855           3.0   8.0
21     2  1235471859           4.0  12.0
22     3  1235471900           NaN   0.0
23     3  1235471904           4.0   4.0
24     3  1235471911           7.0  11.0
25     3  1235471913           2.0   0.0

Another solution:

# Requires Python 3.8+ for the := operator.
df = df.groupby("trip", group_keys=False).apply(
    lambda x: x.assign(
        cumu=(
            val := 0,
            *(
                val := val + v if val < 10 else (val := 0)
                for v in x["TimeDistance"][1:]
            ),
        )
    ),
)
print(df)
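
A further variant, not part of the original answer, expresses the same reset rule with `itertools.accumulate` and applies it per trip through `groupby(...).transform`, which leaves the other columns and the row index untouched. This is a minimal sketch: the names `THRESHOLD`, `step` and `cumu_per_trip` are hypothetical, and the data is limited to trips 2 and 3 from the question to keep it short.

```python
from itertools import accumulate

import pandas as pd

THRESHOLD = 10  # same threshold as in the question


def step(total, value):
    # Restart at 0 on the row after the threshold was reached; otherwise keep adding.
    return 0 if total >= THRESHOLD else total + value


def cumu_per_trip(s):
    # s holds the TimeDistance values of one trip. The first one is NaN (the origin),
    # so it is skipped and replaced by the initial total of 0.
    totals = accumulate(s.iloc[1:], step, initial=0)  # initial= needs Python 3.8+
    return pd.Series(list(totals), index=s.index, dtype="float64")


data = {
    "trip": [2] * 10 + [3] * 4,
    "timestamps": [
        1235471829, 1235471833, 1235471835, 1235471838, 1235471844,
        1235471847, 1235471848, 1235471852, 1235471855, 1235471859,
        1235471900, 1235471904, 1235471911, 1235471913,
    ],
}
df = pd.DataFrame(data)
df["TimeDistance"] = df.groupby("trip")["timestamps"].diff()
df["cumu"] = df.groupby("trip")["TimeDistance"].transform(cumu_per_trip)
print(df)
```

For these two trips the 'cumu' column should agree with rows 12 to 25 of the output printed above.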

Regarding python - How to compute a cumulative sum that resets after reaching a threshold, per group, in a pandas DataFrame?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/74104136/
