gpt4 book ai didi

相当于 SQL 窗口函数和行范围的 Pandas

转载 作者:行者123 更新时间:2023-12-03 14:31:15 25 4
gpt4 key购买 nike

考虑最小的例子

customer   day  purchase
Joe 1 5
Joe 1 10
Joe 2 5
Joe 2 5
Joe 4 10
Joe 7 5
在 BigQuery 中,可以做类似的事情来获取客户在过去 2 天内每天花费的金额:
SELECT customer, day
, sum(purchase) OVER (PARTITION BY customer ORDER BY day ASC RANGE between 2 preceding and 1 preceding)
FROM table
Pandas 中的等价物是什么?即,预期结果
customer   day  purchase    amount_last_2d
Joe 1 5 null -- spent days [-,-]
Joe 1 10 null -- spent days [-,-]
Joe 2 5 15 -- spent days [-,1]
Joe 2 5 15 -- spent days [-,1]
Joe 4 10 10 -- spent days [2,3]
Joe 7 5 0 -- spent days [5,6]

最佳答案

试试 groupbyshift然后 reindex背部

df['new'] = df.groupby(['customer','day']).purchase.sum().shift().reindex(pd.MultiIndex.from_frame(df[['customer','day']])).values
df
Out[259]:
customer day purchase new
0 Joe 1 5 NaN
1 Joe 1 10 NaN
2 Joe 2 10 15.0
3 Joe 2 5 15.0
4 Joe 4 10 15.0
更新
s = df.groupby(['customer','day']).apply(lambda x : df.loc[df.customer.isin(x['customer'].tolist()) & (df.day.isin(x['day']-1)|df.day.isin(x['day']-2)),'purchase'].sum())
df['new'] = s.reindex(pd.MultiIndex.from_frame(df[['customer','day']])).values
df
Out[271]:
customer day purchase new
0 Joe 1 5 0
1 Joe 1 10 0
2 Joe 2 5 15
3 Joe 2 5 15
4 Joe 4 10 10
5 Joe 7 5 0

关于相当于 SQL 窗口函数和行范围的 Pandas,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/65959870/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com