
python - Pandas Groupby.diff: fill missing rows with zero

Reposted. Author: 行者123. Updated: 2023-12-01 02:18:08

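I'm sure this has been posted somewhere, or it's so simple that I can't see it, but I haven't had any luck finding a post. Any help would be greatly appreciated.

As you can see, I'm trying to do a groupby.diff. Where a date is missing for an ID/ticker pair, I need a negative value to show up.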

df['delta'] = df.groupby(['ID', 'ticker', 'date'])['shares'].diff()

ID  ticker  date        shares  delta
A   AAA     3/31/2012   904180  675010
A   AAA     12/31/2011  229170  NaN
A   BBB     3/31/2012   517756  390117
A   BBB     12/31/2011  127639  NaN
A   CCC     12/31/2011  1757    NaN
A   DDD     12/31/2011  500     NaN
B   AAA     3/31/2012   920920  554920
B   AAA     12/31/2011  366000  NaN
B   BBB     3/31/2012   524     393
B   BBB     12/31/2011  131     NaN

I think I need to pad/fill the missing rows to get this:

ID  ticker  date        shares  delta
A   AAA     3/31/2012   904180  675010
A   AAA     12/31/2011  229170  NaN
A   BBB     3/31/2012   517756  390117
A   BBB     12/31/2011  127639  NaN
A   CCC     3/31/2012   0       -1757
A   CCC     12/31/2011  1757    NaN
A   DDD     3/31/2012   0       -500
A   DDD     12/31/2011  500     NaN
B   AAA     3/31/2012   920920  554920
B   AAA     12/31/2011  366000  NaN
B   BBB     3/31/2012   524     393
B   BBB     12/31/2011  131     NaN

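Thanks again.

Best answer

Use unstack + stack: unstack moves the dates into columns, so every ID/ticker pair gets a slot for every date, and stack(dropna=False) turns them back into rows while keeping the empty slots, which fillna(0) then sets to zero.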

New_df = df.set_index(['ID', 'ticker', 'date']).unstack('date').stack(dropna=False).reset_index().fillna(0)

# Do not group by 'date': each (ID, ticker, date) group would contain a single row, so diff() would return all NaN.
New_df['delta'] = New_df.groupby(['ID', 'ticker'])['shares'].diff()
New_df
Out[316]:
ID ticker date shares delta
0 A AAA 12/31/2011 229170.0 NaN
1 A AAA 3/31/2012 904180.0 675010.0
2 A BBB 12/31/2011 127639.0 NaN
3 A BBB 3/31/2012 517756.0 390117.0
4 A CCC 12/31/2011 1757.0 NaN
5 A CCC 3/31/2012 0.0 -1757.0
6 A DDD 12/31/2011 500.0 NaN
7 A DDD 3/31/2012 0.0 -500.0
8 B AAA 12/31/2011 366000.0 NaN
9 B AAA 3/31/2012 920920.0 554920.0
10 B BBB 12/31/2011 131.0 NaN
11 B BBB 3/31/2012 524.0 393.0
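
The point about leaving 'date' out of the groupby keys can be checked on a cut-down frame. The following is only a minimal sketch: the three rows are copied from the question's table, and the variable names `with_date` and `without_date` are made up for illustration.

```python
import pandas as pd

# Three rows taken from the question's sample data, oldest date first.
df = pd.DataFrame({
    "ID": ["A", "A", "A"],
    "ticker": ["AAA", "AAA", "CCC"],
    "date": ["12/31/2011", "3/31/2012", "12/31/2011"],
    "shares": [229170, 904180, 1757],
})

# Grouping by all three keys leaves one row per group, so there is
# never a previous row to subtract and every diff is NaN.
with_date = df.groupby(["ID", "ticker", "date"])["shares"].diff()

# Grouping by ID and ticker only lets diff() compare consecutive dates
# within each ID/ticker pair.
without_date = df.groupby(["ID", "ticker"])["shares"].diff()

print(with_date.tolist())
print(without_date.tolist())
```

Only the second row of the A/AAA pair has a predecessor in its group, so it is the only row that gets a non-NaN delta in `without_date`.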

After sorting:

New_df.sort_values(['ID','ticker','date'],ascending=[True,True,False])
Out[318]:
ID ticker date shares delta
1 A AAA 3/31/2012 904180.0 675010.0
0 A AAA 12/31/2011 229170.0 NaN
3 A BBB 3/31/2012 517756.0 390117.0
2 A BBB 12/31/2011 127639.0 NaN
5 A CCC 3/31/2012 0.0 -1757.0
4 A CCC 12/31/2011 1757.0 NaN
7 A DDD 3/31/2012 0.0 -500.0
6 A DDD 12/31/2011 500.0 NaN
9 B AAA 3/31/2012 920920.0 554920.0
8 B AAA 12/31/2011 366000.0 NaN
11 B BBB 3/31/2012 524.0 393.0
10 B BBB 12/31/2011 131.0 NaN
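
One caveat: the dates here are strings, so the row order used by diff() and by the sort is lexicographic. It happens to be right for these two dates ('12/31/2011' sorts before '3/31/2012'), but that is not guaranteed for other values. As a possible alternative, here is a sketch that parses the dates first and builds the missing rows with a cross join instead of unstack/stack. It is not the answer's method. The names `pairs`, `dates`, `grid`, `full` and `result` are hypothetical, and `how="cross"` assumes pandas 1.2 or later.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "ID": ["A", "A", "A", "A", "A", "A", "B", "B", "B", "B"],
    "ticker": ["AAA", "AAA", "BBB", "BBB", "CCC", "DDD",
               "AAA", "AAA", "BBB", "BBB"],
    "date": ["3/31/2012", "12/31/2011", "3/31/2012", "12/31/2011",
             "12/31/2011", "12/31/2011", "3/31/2012", "12/31/2011",
             "3/31/2012", "12/31/2011"],
    "shares": [904180, 229170, 517756, 127639, 1757, 500,
               920920, 366000, 524, 131],
})

# Parse the dates so ordering is chronological rather than lexicographic.
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")

# Every existing (ID, ticker) pair crossed with every date seen in the data.
pairs = df[["ID", "ticker"]].drop_duplicates()
dates = df[["date"]].drop_duplicates()
grid = pairs.merge(dates, how="cross")

# Left-join the real rows onto the grid; missing combinations get 0 shares.
full = grid.merge(df, on=["ID", "ticker", "date"], how="left")
full["shares"] = full["shares"].fillna(0).astype("int64")

# Sort oldest-first within each pair, then diff within (ID, ticker).
full = full.sort_values(["ID", "ticker", "date"]).reset_index(drop=True)
full["delta"] = full.groupby(["ID", "ticker"])["shares"].diff()

# Newest date first within each pair, as in the question's desired layout.
result = full.sort_values(["ID", "ticker", "date"],
                          ascending=[True, True, False])
print(result)
```

Like unstack + stack, this only fills dates for ID/ticker pairs that already exist in the data; it does not create pairs such as B/CCC that never appear.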

Regarding "python - Pandas Groupby.diff: fill missing rows with zero", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48191680/
