
Python de-aggregation

Reposted. Author: 太空狗. Updated: 2023-10-30 01:18:28

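I have a dataset that is aggregated between two dates, and I want to de-aggregate it to daily values by dividing the total by the number of days between those dates. As a sample: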

StoreID Date_Start    Date_End     Total_Number_of_sales
78 12/04/2015 17/05/2015 79089
80 12/04/2015 17/05/2015 79089

The dataset I want is:

StoreID Date         Number_Sales 
78 12/04/2015 79089/38(as there are 38 days in between)
78 13/04/2015 79089/38(as there are 38 days in between)
78 14/04/2015 79089/38(as there are 38 days in between)
78 ...
78 17/05/2015 79089/38(as there are 38 days in between)

Any help would be appreciated. Thanks.

Best answer

I'm not sure this is exactly what you want, but you can try this (I added another made-up row):

import pandas as pd

df = pd.DataFrame({'date_start': ['12/04/2015', '17/05/2015'],
                   'date_end': ['18/05/2015', '10/06/2015'],
                   'sales': [79089, 1000]})

# Parse the day-first date strings into datetime objects
df['date_start'] = pd.to_datetime(df['date_start'], format='%d/%m/%Y')
df['date_end'] = pd.to_datetime(df['date_end'], format='%d/%m/%Y')
# Count days inclusively, because pd.date_range below includes both endpoints
df['n_days'] = (df['date_end'] - df['date_start']).dt.days + 1

# Expand each row into one row per day, then concatenate the pieces
pieces = []
for row in df.itertuples():
    new_df = pd.DataFrame(index=pd.date_range(start=row.date_start,
                                              end=row.date_end,
                                              freq='D'))
    new_df['number_sales'] = row.sales / row.n_days
    pieces.append(new_df)
master_df = pd.concat(pieces, axis=0)

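First, convert the string dates to datetime objects (so you can compute the number of days in each range), then build a new index from the date range and divide the sales by that number of days. The loop expands each row of the DataFrame into its own "expanded" DataFrame, and these are then concatenated into one master DataFrame.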
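The same idea can also be sketched without an explicit concatenation loop, while keeping the StoreID column that the question asks for. This is a minimal sketch, not the answer author's method: it assumes both the start and end dates count as sales days (so the divisor is the date difference plus one), and the names `n_days` and `daily` are illustrative only. Under that assumption, the question's sample dates give 36 days per store rather than the 38 mentioned in the question.

```python
import pandas as pd

# Sample rows from the question (dates are day-first strings)
df = pd.DataFrame({
    'StoreID': [78, 80],
    'Date_Start': ['12/04/2015', '12/04/2015'],
    'Date_End': ['17/05/2015', '17/05/2015'],
    'Total_Number_of_sales': [79089, 79089],
})
df['Date_Start'] = pd.to_datetime(df['Date_Start'], format='%d/%m/%Y')
df['Date_End'] = pd.to_datetime(df['Date_End'], format='%d/%m/%Y')

# Assumption: both endpoints count as sales days, so the divisor is diff + 1
df['n_days'] = (df['Date_End'] - df['Date_Start']).dt.days + 1

# Build one list of daily timestamps per row, then explode to one row per day
df['Date'] = [list(pd.date_range(start, end, freq='D'))
              for start, end in zip(df['Date_Start'], df['Date_End'])]
daily = df.explode('Date', ignore_index=True)
daily['Date'] = pd.to_datetime(daily['Date'])  # explode leaves object dtype

# Spread each store's total evenly over its days
daily['Number_Sales'] = daily['Total_Number_of_sales'] / daily['n_days']
daily = daily[['StoreID', 'Date', 'Number_Sales']]
```

Because the divisor matches the number of generated rows, summing `Number_Sales` per StoreID gives back the original total, up to floating-point rounding. If the end date should not count as a sales day, drop the `+ 1` and pass `inclusive='left'` to `pd.date_range` (available in pandas 1.4 and later).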

Regarding Python de-aggregation, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51746226/
