
python - Calculating per-group differences between values in pandas

Reposted. Author: 行者123. Updated: 2023-12-01 04:52:14

My dataframe has the following structure:

patient_id  |  timestamp  |  measurement
A | 2014-10-10 | 5.7
A | 2014-10-11 | 6.3
B | 2014-10-11 | 6.1
B | 2014-10-10 | 4.1

I want to calculate the delta (difference) between consecutive measurements for each patient.

The result should look like this:

patient_id  |  timestamp  |  measurement  |    delta
A | 2014-10-10 | 5.7 | NaN
A | 2014-10-11 | 6.3 | 0.6
B | 2014-10-11 | 6.1 | 2.0
B | 2014-10-10 | 4.1 | NaN

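What is the most elegant way to do this in pandas?

Best answer

Call transform on the 'measurement' column and pass it the diff method. transform returns a Series whose index is aligned with the original df: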

In [4]:

df['delta'] = df.groupby('patient_id')['measurement'].transform(pd.Series.diff)
df
Out[4]:
patient_id timestamp measurement delta
0 A 2014-10-10 5.7 NaN
1 A 2014-10-11 6.3 0.6
2 B 2014-10-10 4.1 NaN
3 B 2014-10-11 6.1 2.0
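
For anyone who wants to reproduce this outside an interactive session, here is a minimal self-contained sketch. It assumes the sample data is typed in by hand in the row order shown in the output above; the names via_transform and via_diff are only illustrative. It also tries the groupby diff shortcut, which should give the same values as transform(pd.Series.diff):

```python
import pandas as pd

# Sample data, already in chronological order within each patient.
df = pd.DataFrame({
    "patient_id": ["A", "A", "B", "B"],
    "timestamp": pd.to_datetime(
        ["2014-10-10", "2014-10-11", "2014-10-10", "2014-10-11"]
    ),
    "measurement": [5.7, 6.3, 4.1, 6.1],
})

grouped = df.groupby("patient_id")["measurement"]

# Approach from the answer: apply Series.diff to each group via transform.
via_transform = grouped.transform(pd.Series.diff)

# Shortcut: the grouped Series has its own diff method.
via_diff = grouped.diff()

df["delta"] = via_diff
print(df)
```

Both variants leave NaN on the first row of each patient, because there is no earlier measurement to subtract.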

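Edit

If the rows are not already in chronological order within each patient, sort the df first so that diff follows the timestamps. The result is assigned back by index, so df keeps its original row order: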

In [10]:

# df.sort(columns=...) was removed from pandas; sort_values is its replacement
df['delta'] = df.sort_values(by=['patient_id', 'timestamp']).groupby('patient_id')['measurement'].transform(pd.Series.diff)
df
Out[10]:
patient_id timestamp measurement delta
0 A 2014-10-10 5.7 NaN
1 A 2014-10-11 6.3 0.6
2 B 2014-10-11 6.1 2.0
3 B 2014-10-10 4.1 NaN
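
To see why the sort matters, the following sketch rebuilds the question's data with patient B's rows out of chronological order and compares the delta with and without sorting. The names ordered and unsorted_delta are illustrative only, and the grouped diff method is used here in place of transform(pd.Series.diff):

```python
import pandas as pd

# Data in the row order given in the question: B's later row comes first.
df = pd.DataFrame({
    "patient_id": ["A", "A", "B", "B"],
    "timestamp": pd.to_datetime(
        ["2014-10-10", "2014-10-11", "2014-10-11", "2014-10-10"]
    ),
    "measurement": [5.7, 6.3, 6.1, 4.1],
})

# Without sorting, diff follows row order, so B's delta lands on the
# earlier date and has the wrong sign.
unsorted_delta = df.groupby("patient_id")["measurement"].diff()

# Sort a copy by patient and time, diff within each patient, then let
# pandas align the result back onto df by index.
ordered = df.sort_values(["patient_id", "timestamp"])
df["delta"] = ordered.groupby("patient_id")["measurement"].diff()

print(df)
```

Because the assignment aligns on the index, df itself is never reordered; only the computation runs in timestamp order.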

This question, python - calculating per-group differences between values in pandas, comes from a similar question found on Stack Overflow: https://stackoverflow.com/questions/28178740/
