
python - How do I check for similarity between rows in a dataframe and add a column that acts as a counter, incrementing when rows match?

Reposted. Author: 行者123. Updated: 2023-11-30 09:47:43

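I'm fairly new to Python (pandas); please help me solve this.

This is what my dataframe looks like. Device_Id is the ID of the device that showed a message (Msg) at a given time (for example 1524724677); Time is in epoch seconds.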

  Device_Id  Msg            Time
0 ABC123     connected      1524724677
1 ABC123     connected      1524724679
2 XYZ123     device failed  1524724814
3 ABC123     connected      1524725279
4 XVZ123     device failed  1524725300
5 PQR123     error          1524725325
6 ABC123     connected      1524725345

I have to perform an operation on every row of the dataframe so that I can add some new columns.

The dataframe I want looks like this:

  Device_Id  Msg            Time        count
0 ABC123     connected      1524724677  1
1 ABC123     connected      1524724679  2
2 XYZ123     device failed  1524724814  1
3 ABC123     connected      1524725279  1
4 XVZ123     device failed  1524725300  1
5 PQR123     error          1524725325  1
6 ABC123     connected      1524725345  2

The count column works as follows. Please read all the points to see exactly how it behaves:

-- Row 0: count is 1, because this is the first occurrence of the device.
-- The counter increases with respect to Time.
-- The counter values are reset after every 10 minutes.
-- Row 1: count is 2, because its time (1524724679) is between
   1524724677 and 1524724677 + 10 minutes.
-- Row 2: count is 1, because it is a new device and its time (1524724814)
   is between 1524724677 and 1524724677 + 10 minutes.
-- Row 3: count is 1 even though the device is not new, because its time
   (1524725279) is not between 1524724677 and 1524724677 + 10 minutes
   (the counter has been reset).
-- Row 4: count is 1, because it is a new device and its time (1524725300)
   is between 1524725279 and 1524725279 + 10 minutes.
-- Row 5: count is 1, because it is a new device and its time (1524725325)
   is between 1524725279 and 1524725279 + 10 minutes.
-- Row 6: count is 2, because the device already appeared in row 3 and its
   time (1524725345) is between 1524725279 and 1524725279 + 10 minutes.

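The count is reset every 10 minutes, which means every device_id starts again from 1.

After each 10-minute window, every unique device_id is treated as new, which is why the count restarts at 1 and then continues from there for the next 10 minutes.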
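
The reset rule described above can be read literally as "a 10-minute window starts at the first row seen after the previous window has expired". The following is only a minimal sketch of that reading, not part of the original post. It assumes the rows are already sorted by Time, that a single window is shared by all devices (as in the row 0 to row 6 walkthrough), and that the window is half-open, i.e. a row exactly 600 seconds after the window start opens a new window. The helper name windowed_counts and the constant WINDOW_SECONDS are hypothetical.

```python
import pandas as pd

WINDOW_SECONDS = 600  # 10 minutes; assumption taken from the rule in the question


def windowed_counts(times, device_ids, window=WINDOW_SECONDS):
    """Per-device running count inside windows anchored at the first row
    seen after the previous window expired (hypothetical helper)."""
    counts = []
    window_start = None
    seen = {}  # device id -> occurrences inside the current window
    for t, dev in zip(times, device_ids):
        # Open a new window (and forget all devices) once the old one expires.
        if window_start is None or t >= window_start + window:
            window_start = t
            seen = {}
        seen[dev] = seen.get(dev, 0) + 1
        counts.append(seen[dev])
    return counts


# Sample data copied from the question.
df = pd.DataFrame({
    "Device_Id": ["ABC123", "ABC123", "XYZ123", "ABC123",
                  "XVZ123", "PQR123", "ABC123"],
    "Msg": ["connected", "connected", "device failed", "connected",
            "device failed", "error", "connected"],
    "Time": [1524724677, 1524724679, 1524724814, 1524725279,
             1524725300, 1524725325, 1524725345],
})

df["count"] = windowed_counts(df["Time"], df["Device_Id"])
print(df)
```

On the sample data this gives the counts 1, 2, 1, 1, 1, 1, 2 from the desired dataframe. A plain loop is used because each window start depends on where the previous window ended, which is awkward to express as a single vectorized operation.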

Best answer

You can solve this easily with groupby and pd.Grouper:

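import pandas as pd

# df is assumed to be the dataframe shown in the question

# convert the epoch seconds to datetimes
df['Time'] = pd.to_datetime(df['Time'], unit='s')

# number the rows within each (device, 10-minute bin) group, starting at 1
df['count'] = df.groupby(['Device_Id', pd.Grouper(key='Time', freq='10min')]).cumcount() + 1

print(df)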

  Device_Id  Msg            Time                 count
0 ABC123     connected      2018-04-26 06:37:57  1
1 ABC123     connected      2018-04-26 06:37:59  2
2 XYZ123     device failed  2018-04-26 06:40:14  1
3 ABC123     connected      2018-04-26 06:47:59  1
4 XVZ123     device failed  2018-04-26 06:48:20  1
5 PQR123     error          2018-04-26 06:48:45  1
6 ABC123     connected      2018-04-26 06:49:05  2
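
For anyone who wants to run the answer end to end, here is a self-contained sketch that rebuilds the sample dataframe from the question and applies the same groupby call. The extra bin_start column is not part of the original answer; it is added here only to make visible which 10-minute bin each row falls into, on the assumption that pd.Grouper with freq='10min' uses its default clock-aligned bins (06:30, 06:40, ...), which dt.floor reproduces.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Device_Id": ["ABC123", "ABC123", "XYZ123", "ABC123",
                  "XVZ123", "PQR123", "ABC123"],
    "Msg": ["connected", "connected", "device failed", "connected",
            "device failed", "error", "connected"],
    "Time": [1524724677, 1524724679, 1524724814, 1524725279,
             1524725300, 1524725325, 1524725345],
})

# Convert epoch seconds to datetimes.
df["Time"] = pd.to_datetime(df["Time"], unit="s")

# Number rows within each (device, 10-minute bin) group, starting at 1.
df["count"] = (
    df.groupby(["Device_Id", pd.Grouper(key="Time", freq="10min")]).cumcount() + 1
)

# Illustration only: start of the clock-aligned bin each row belongs to.
df["bin_start"] = df["Time"].dt.floor("10min")

print(df)
```

Note that these bins are aligned to the clock rather than to the first message: rows 0 and 1 land in the 06:30 bin and rows 2 to 6 in the 06:40 bin, whereas the question counts row 2 in the same window as rows 0 and 1. The counts agree on this sample because XYZ123 appears only once, but they could differ on other data, so it is worth checking which windowing the use case actually needs.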

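Regarding "python - How do I check for similarity between rows in a dataframe and add a column that acts as a counter, incrementing when rows match?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50212175/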
