
python - pandas: merge rows that share the same date

Reposted. Author: 行者123. Updated: 2023-12-01 06:50:35

I have a DataFrame containing day-by-day log entries. Some of the login and logout values fall on the same day. For example, df:


+-------+-----------------------------+--------------+-----------------------------+---------------+-----------+------------+
| Id | LoginTime | LoginReading | LogoutTime | LogoutReading | Available | Calculated |
+-------+-----------------------------+--------------+-----------------------------+---------------+-----------+------------+
| 10036 | 2019-11-06 10:37:18.3743184 | 5054.68 | 2019-11-06 14:11:52.4833904 | 5057.94 | 500 | 530 |
| 10036 | 2019-11-11 12:46:20.7018683 | 4797.39 | 2019-11-11 15:55:48.9102874 | 4846.86 | 500 | 1091 |
| 10036 | 2019-11-11 15:56:42.7485310 | 4846.86 | 2019-11-11 17:20:40.2587713 | 4866.68 | 500 | 370 |
| 10036 | 2019-11-11 17:21:12.9416691 | 4866.68 | 2019-11-11 18:09:55.8326356 | 4892.33 | 500 | 311 |
| 10036 | 2019-11-14 08:08:00.6290260 | 5080.59 | 2019-11-14 18:43:31.8609822 | 5185.65 | 500 | 2528 |
| 10036 | 2019-11-15 06:00:48.4777280 | 5185.65 | 2019-11-15 06:31:55.0281168 | 5199.28 | 500 | 260 |
+-------+-----------------------------+--------------+-----------------------------+---------------+-----------+------------+

I am trying to collapse the rows that share the same date into a single row. This is the result I am aiming for:

+-------+-----------------------------+--------------+-----------------------------+---------------+-----------+------------+
| Id | LoginTime | LoginReading | LogoutTime | LogoutReading | Available | Calculated |
+-------+-----------------------------+--------------+-----------------------------+---------------+-----------+------------+
| 10036 | 2019-11-06 10:37:18.3743184 | 5054.68 | 2019-11-06 14:11:52.4833904 | 5057.94 | 500 | 530 |
| 10036 | 2019-11-11 12:46:20.7018683 | 4797.39 | 2019-11-11 18:09:55.8326356 | 4892.33 | 500 | 1772 |
| 10036 | 2019-11-14 08:08:00.6290260 | 5080.59 | 2019-11-14 18:43:31.8609822 | 5185.65 | 500 | 2528 |
| 10036 | 2019-11-15 06:00:48.4777280 | 5185.65 | 2019-11-15 06:31:55.0281168 | 5199.28 | 500 | 260 |
+-------+-----------------------------+--------------+-----------------------------+---------------+-----------+------------+

I tried the approaches from a few similar questions, but they did not work for the logout time.

Best answer

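You can create a new column that tracks the login date, then use groupby() with agg() to pick the value you want for each day: the first login, the last logout, and the sum of Calculated: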

df['Day'] = df['LoginTime'].str.extract(r'^(\d{4}-\d{2}-\d{2})')

df.groupby('Day').agg({'Id': 'first', 'LoginTime': 'first', 'LoginReading': 'first',
                       'LogoutTime': 'last', 'LogoutReading': 'last',
                       'Available': 'first', 'Calculated': 'sum'}).reset_index(drop=True)

This yields:

      Id                    LoginTime  LoginReading  \
0  10036  2019-11-06 10:37:18.3743184       5054.68
1  10036  2019-11-11 12:46:20.7018683       4797.39
2  10036  2019-11-14 08:08:00.6290260       5080.59
3  10036  2019-11-15 06:00:48.4777280       5185.65

                    LogoutTime  LogoutReading  Available  Calculated
0  2019-11-06 14:11:52.4833904        5057.94        500         530
1  2019-11-11 18:09:55.8326356        4892.33        500        1772
2  2019-11-14 18:43:31.8609822        5185.65        500        2528
3  2019-11-15 06:31:55.0281168        5199.28        500         260
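
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea using the sample rows from the question. It is a variation, not the answer's exact code, and it makes a few assumptions of its own: the timestamps stay as fixed-width strings (so the day is just the first 10 characters and a plain string sort is chronological), the rows are sorted first so that 'first' and 'last' really mean earliest and latest, and the grouping is by Id as well as by day in case the real data holds more than one Id. The name merged is only illustrative.

```python
import pandas as pd

# Sample rows copied from the question; timestamps are kept as strings.
df = pd.DataFrame({
    "Id": [10036] * 6,
    "LoginTime": [
        "2019-11-06 10:37:18.3743184", "2019-11-11 12:46:20.7018683",
        "2019-11-11 15:56:42.7485310", "2019-11-11 17:21:12.9416691",
        "2019-11-14 08:08:00.6290260", "2019-11-15 06:00:48.4777280",
    ],
    "LoginReading": [5054.68, 4797.39, 4846.86, 4866.68, 5080.59, 5185.65],
    "LogoutTime": [
        "2019-11-06 14:11:52.4833904", "2019-11-11 15:55:48.9102874",
        "2019-11-11 17:20:40.2587713", "2019-11-11 18:09:55.8326356",
        "2019-11-14 18:43:31.8609822", "2019-11-15 06:31:55.0281168",
    ],
    "LogoutReading": [5057.94, 4846.86, 4866.68, 4892.33, 5185.65, 5199.28],
    "Available": [500] * 6,
    "Calculated": [530, 1091, 370, 311, 2528, 260],
})

# Assumption: all timestamps share the same YYYY-MM-DD HH:MM:SS.fffffff layout,
# so sorting them as strings orders them chronologically.
df = df.sort_values(["Id", "LoginTime"])

# The first 10 characters of LoginTime are the calendar date.
df["Day"] = df["LoginTime"].str[:10]

# One row per (Id, Day): earliest login, latest logout, summed Calculated.
merged = (
    df.groupby(["Id", "Day"], as_index=False)
      .agg(
          LoginTime=("LoginTime", "first"),
          LoginReading=("LoginReading", "first"),
          LogoutTime=("LogoutTime", "last"),
          LogoutReading=("LogoutReading", "last"),
          Available=("Available", "first"),
          Calculated=("Calculated", "sum"),
      )
      .drop(columns="Day")
)

print(merged)
```

The three rows for 2019-11-11 collapse into one row whose Calculated value is 1091 + 370 + 311 = 1772. If a session can span midnight, grouping on the login date alone may not be what you want; that case is not covered by the question's data.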

Regarding python - pandas: merge rows that share the same date, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59025606/
