python - Advanced pandas: How to get results for customers who have bought at least twice within a 5-day period?

Reposted. Author: 行者123. Updated: 2023-12-02 16:44:50

I have been stuck on this problem for several hours. Here is an overview:

import numpy as np
import pandas as pd


df = pd.DataFrame({'orderid': [10315, 10318, 10321, 10473, 10621, 10253, 10541, 10645],
                   'customerid': ['ISLAT', 'ISLAT', 'ISLAT', 'ISLAT', 'ISLAT', 'HANAR', 'HANAR', 'HANAR'],
                   'orderdate': ['1996-09-26', '1996-10-01', '1996-10-03', '1997-03-13', '1997-08-05', '1996-07-10', '1997-05-19', '1997-08-26']})
df

   orderid customerid   orderdate
0    10315      ISLAT  1996-09-26
1    10318      ISLAT  1996-10-01
2    10321      ISLAT  1996-10-03
3    10473      ISLAT  1997-03-13
4    10621      ISLAT  1997-08-05
5    10253      HANAR  1996-07-10
6    10541      HANAR  1997-05-19
7    10645      HANAR  1997-08-26

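I want to select all customers who have placed more than one order within a 5-day period.

For example, here only customer ISLAT placed orders within 5 days of each other, and did so twice.

I would like the output in the following format:

Required output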

customerid  initial_order_id  initial_order_date  nextorderid  nextorderdate  daysbetween
ISLAT       10315             1996-09-26          10318        1996-10-01     5
ISLAT       10318             1996-10-01          10321        1996-10-03     2

Best answer

First, to be able to compute differences in days, convert the orderdate column to datetime:

df.orderdate = pd.to_datetime(df.orderdate)

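Then define the following function, which keeps each order that is followed by another order from the same customer within 5 days:

def fn(grp):
    # shift(-1) aligns each row with the next order's date; the last row gets NaT and is dropped
    return grp[(grp.orderdate.shift(-1) - grp.orderdate) / np.timedelta64(1, 'D') <= 5]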
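To make the condition inside fn easier to follow, here is a small sketch that computes the same day gaps for a single customer. The names islat, gap_days and mask are only illustrative and are not part of the original answer; the data is the ISLAT subset of the question's DataFrame, already sorted by date.

```python
import numpy as np
import pandas as pd

# ISLAT's orders from the question, already in date order
islat = pd.DataFrame({
    'orderid': [10315, 10318, 10321, 10473, 10621],
    'orderdate': pd.to_datetime(['1996-09-26', '1996-10-01', '1996-10-03',
                                 '1997-03-13', '1997-08-05']),
})

# Days from each order to the next one; the last order has no successor, so it gets NaN
gap_days = (islat.orderdate.shift(-1) - islat.orderdate) / np.timedelta64(1, 'D')

# NaN <= 5 evaluates to False, so the last order is never selected
mask = gap_days <= 5

print(gap_days.tolist())
print(islat[mask])
```

Only the first two orders pass the mask, because they are the ones followed by another order within 5 days.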

Finally, apply it to each customer's orders, sorted by date:

df.sort_values(['customerid', 'orderdate']).groupby('customerid').apply(fn)
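The call above returns the qualifying rows in their original three columns. To get the exact layout requested in the question, one possible approach (a sketch, not part of the original answer) is to attach each customer's next order with a grouped shift(-1) and then filter on the gap. The column names come from the question's required output; the variable names grouped and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'orderid': [10315, 10318, 10321, 10473, 10621, 10253, 10541, 10645],
                   'customerid': ['ISLAT', 'ISLAT', 'ISLAT', 'ISLAT', 'ISLAT', 'HANAR', 'HANAR', 'HANAR'],
                   'orderdate': ['1996-09-26', '1996-10-01', '1996-10-03', '1997-03-13',
                                 '1997-08-05', '1996-07-10', '1997-05-19', '1997-08-26']})
df['orderdate'] = pd.to_datetime(df['orderdate'])
df = df.sort_values(['customerid', 'orderdate'])

# Pair every order with the same customer's next order
grouped = df.groupby('customerid')
result = df.assign(nextorderid=grouped['orderid'].shift(-1),
                   nextorderdate=grouped['orderdate'].shift(-1))
result['daysbetween'] = (result['nextorderdate'] - result['orderdate']).dt.days

# Keep pairs at most 5 days apart; rows without a next order have NaN and drop out
result = result[result['daysbetween'] <= 5]

# Rename and reorder columns to match the requested output
result = result.rename(columns={'orderid': 'initial_order_id',
                                'orderdate': 'initial_order_date'})
result = result.astype({'nextorderid': 'int64', 'daysbetween': 'int64'})
result = result[['customerid', 'initial_order_id', 'initial_order_date',
                 'nextorderid', 'nextorderdate', 'daysbetween']].reset_index(drop=True)
print(result)
```

The grouped shift avoids groupby.apply, so the result keeps a flat index and the customer column needs no extra handling.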

This question and answer originally appeared on Stack Overflow: https://stackoverflow.com/questions/60793624/
