
python - How to speed up a Python for loop

Reposted. Author: 太空宇宙. Updated: 2023-11-03 17:16:35

I have the following function, where df is a pandas DataFrame with 159538 rows x 3 columns:

dfs = []
for i in df['email_address']:
    data = df[df['email_address'] == i]
    data['difference'] = data['ts_placed'].diff().astype('timedelta64[D]')
    repeat = []
    for a in data['difference']:
        if a > 10:
            repeat.append(0)
        elif a <= 10:
            repeat.append(1)
        else:
            repeat.append(0)
    data['repeat'] = repeat
    dfs.append(data)

This function runs very slowly. I would like to speed it up by using multiprocessing. This SO question shows how to do this in R. What is the equivalent code in Python?
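One reason the loop is slow: it iterates over every row of `email_address`, so an address that appears several times has its rows filtered and processed once per occurrence, and each filter scans the whole frame. The following is a minimal sketch of the same logic that visits each address only once via `groupby`. It is not multiprocessing, only a way to cut the repeated work. The helper name `flag_repeats_per_group`, the `max_days` parameter, and the sample rows are hypothetical, and the sketch assumes `ts_placed` is already a datetime column ordered by time within each address.

```python
import pandas as pd

def flag_repeats_per_group(df, max_days=10):
    """Hypothetical helper: process each email address once, not once per row."""
    parts = []
    for _, data in df.groupby('email_address', sort=False):
        data = data.copy()  # work on a copy, not a slice of the original frame
        data['difference'] = data['ts_placed'].diff()
        # Comparisons against NaT are False, so the first order per address gets 0
        data['repeat'] = (data['difference'] <= pd.Timedelta(days=max_days)).astype(int)
        parts.append(data)
    # Restore the original row order
    return pd.concat(parts).sort_index()

# Made-up sample data for illustration only
sample = pd.DataFrame({
    'email_address': ['a@x.com', 'b@x.com', 'a@x.com', 'b@x.com'],
    'ts_placed': pd.to_datetime(['2015-08-01', '2015-08-01', '2015-08-05', '2015-08-20']),
})
result = flag_repeats_per_group(sample)
print(result)
```

Here the second order from `a@x.com` arrives 4 days after the first and is flagged, while the second order from `b@x.com` arrives 19 days later and is not.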

Here is a sample of the data after running:

df['difference'] = df.groupby('email_address')['ts_placed'].diff()



df
Out[6]:
                       email_address            ts_placed       difference
 0             aaaaaaaaaaaaa@sky.com  2015-08-06 00:00:34              NaT
 1         dfdfdfdfdfd@babcock.co.uk  2015-08-06 00:05:38              NaT
 2   littlemifddreen85@hotmail.co.uk  2015-08-06 00:09:20              NaT
 3               smifdfddfms@aol.com  2015-08-06 00:10:01              NaT
 4  terry.wfdfdfdfdfy-holdings.co.uk  2015-08-06 00:14:00              NaT
 5         r.dfdfdfdfd16@hotmail.com  2015-08-06 00:14:00              NaT
 6            kdfdfdf979@outlook.com  2015-08-06 00:14:00              NaT
 7      dd@ggggggggggg.eclipse.co.uk  2015-08-06 00:14:20              NaT
 8              gggz45@hotmail.co.uk  2015-08-06 00:14:43              NaT
 9          gggggggggi@hotmail.co.uk  2015-08-06 00:17:03              NaT
10         mggggggggyke1@hotmail.com  2015-08-06 00:17:58              NaT
...
22               ffdddfddd@yahoo.com  2015-08-06 00:46:12  0 days 00:04:15

Best answer

IIUC, you can do the following:

df['difference'] = df.groupby('email_address')['ts_placed'].diff()

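# Series has no cumcount(), and a timedelta cannot be compared with a bare 10;
# assuming a running count of gaps under 10 days per email was intended (needs `import pandas as pd`):
df['repeat'] = df.groupby('email_address')['difference'].transform(lambda x: (x < pd.Timedelta(days=10)).cumsum())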
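The loop in the question marks a row with 1 when the gap to the previous order from the same email is at most 10 days, and with 0 otherwise (including the first order per email, where the gap is NaT). A minimal sketch of that flag computed without a Python-level loop is shown here. The sample rows and the name `orders` are made up for illustration, and the 10-day threshold is taken from the question's `a <= 10` check on the assumption that it is measured in days.

```python
import pandas as pd

# Made-up sample rows; the column names follow the question
orders = pd.DataFrame({
    'email_address': ['a@x.com', 'b@x.com', 'b@x.com', 'a@x.com', 'a@x.com'],
    'ts_placed': pd.to_datetime([
        '2015-08-06 00:00:34',
        '2015-08-06 00:05:38',
        '2015-08-06 00:09:53',
        '2015-08-09 00:00:34',
        '2015-08-30 00:00:34',
    ]),
})

# Gap to the previous order from the same address; NaT for each address's first order
orders['difference'] = orders.groupby('email_address')['ts_placed'].diff()

# Vectorized flag: 1 if the gap is at most 10 days, else 0 (NaT compares as False)
orders['repeat'] = (orders['difference'] <= pd.Timedelta(days=10)).astype(int)

# Number of flagged repeat orders per address
repeats_per_email = orders.groupby('email_address')['repeat'].sum()
print(orders)
print(repeats_per_email)
```

Because the comparison runs on the whole column at once, no per-group lambda is needed for the flag itself. The `groupby` is only used for `diff()` and for the optional per-address summary.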

Regarding "python - How to speed up a Python for loop", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/33607455/
