
python - How to add a timedelta column and relate it to another column?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 20:46:37

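I have two dataframes. One is grouped by user_id and counts how many times each user_id appears. The other records the date and time at which each user used the service. For the second dataframe, I want to compute a timedelta per user (latest date minus earliest date), add that timedelta to the first dataframe, and ideally add one more column that extracts the number of days from the timedelta. I assumed I would need a loop over user_id. I have tried many times but cannot get the result I want.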

import pandas as pd
df1 = pd.DataFrame({'user_id': ['8', '2','5', '1', '10', '4'], 'usage_times':[466,423,401,350,352,333]})
df2 = pd.DataFrame({'user_id': ['1', '5','5', '8', '8', '1'], 'Date':['2010-11-16 16:44:52','2010-06-01 00:34:38','2010-05-31 05:01:24','2010-06-01 00:29:30','2010-09-11 23:55:00','2010-08-10 13:00:00']})
df1:
user_id usage_times
8 466
2 423
5 401
1 350
10 352
4 333
df2:
user_id Date
1 2010-11-16 16:44:52
5 2010-06-01 00:34:38
5 2010-05-31 05:01:24
8 2010-06-01 00:29:30
8 2010-09-11 23:55:00
1 2010-08-10 13:00:00

The code I tried is:

for users in top_users.user_id:
    latest_trip = df_final[(df_final['user_id'] == users)]['start_at'].max()
    earliest_trip = df_final[(df_final['user_id'] == users)]['start_at'].min()
    usage_period = earliest_trip - latest_trip
    times = days_hours_minutes(usage_period)
    top_users['period'] = top_users.apply(lambda x: list(x) for x in times)

I want the dataframe to look like this:

df1:
user_id usage_times period days
8 466 100 days, 00:23:45 100
2 423 15 days, 00:05:45 15
5 401 104 days, 00:23:45 104
1 350 72 days, 00:15:45 72
10 352 40 days, 00:23:45 40
4 333 28 days, 00:43:45 28

Best answer

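IIUC, you can merge df1 and df2 and use groupby to create period. The Date column must hold datetimes rather than strings for the subtraction to work, so it is converted first: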

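# Date is stored as strings above; convert so that max() - min() yields a Timedelta
df2['Date'] = pd.to_datetime(df2['Date'])
df = df1.merge(df2, on='user_id')
# per-user span between latest and earliest date, broadcast to every row of that user
df['period'] = df.groupby('user_id')['Date'].transform(lambda x: x.max() - x.min())
df['days'] = df['period'].dt.days
# keep one row per user and drop the no longer needed Date column
df = df.drop_duplicates('user_id').drop(columns=['Date'])
df.head()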


user_id usage_times period days
0 8 466 102 days 23:25:30 102
2 5 401 0 days 19:33:14 0
4 1 350 98 days 03:44:52 98
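
Note that the merge above is an inner join, so users with no rows in df2 (2, 10 and 4 in this sample) drop out of the result. If every row of df1 should be kept, one possible variation is to aggregate df2 per user first and then left-merge the result onto df1. This is only a sketch under that assumption; the names span and result are illustrative and do not come from the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({'user_id': ['8', '2', '5', '1', '10', '4'],
                    'usage_times': [466, 423, 401, 350, 352, 333]})
df2 = pd.DataFrame({'user_id': ['1', '5', '5', '8', '8', '1'],
                    'Date': ['2010-11-16 16:44:52', '2010-06-01 00:34:38',
                             '2010-05-31 05:01:24', '2010-06-01 00:29:30',
                             '2010-09-11 23:55:00', '2010-08-10 13:00:00']})

# Parse the strings so that max - min yields a Timedelta.
df2['Date'] = pd.to_datetime(df2['Date'])

# One row per user: earliest and latest date, then their difference.
span = df2.groupby('user_id')['Date'].agg(['min', 'max'])
span['period'] = span['max'] - span['min']
span['days'] = span['period'].dt.days

# how='left' keeps every row of df1; users absent from df2 get NaT / NaN.
result = df1.merge(span[['period', 'days']].reset_index(),
                   on='user_id', how='left')
print(result)
```

Because missing users produce NaN, the days column comes back as float here; whether to fill those gaps (for example with 0) depends on what a missing usage period should mean in your data.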

Regarding "python - How to add a timedelta column and relate it to another column?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56547315/
