
python - How to speed up conditional statements in Python

Reposted. Author: 行者123. Updated: 2023-12-04 08:21:18

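I am trying to generate a new column in a pandas DataFrame by looping over more than 100,000 rows and setting each row's value based on existing rows.
The DataFrame below is a dummy, but it works as an example. My current code is: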

import pandas as pd

df = pd.DataFrame({'IT100': [5, 5, -0.001371, 0.0002095, -5, 0, -5, 5, 5],
                   'ET110': [0.008187884, 0.008285232, 0.00838258, 0.008479928, 1, 1, 1, 1, 1]})

# if charging set to 1, if discharging set to -1.
# if -1 < IT100 < 1 then set CD to the previous cell's value
# Charging is defined as IT100 > 1 and discharging is defined as IT100 < -1


def CD(dataFrame):
    for x in range(0, len(dataFrame.index)):
        current = dataFrame.loc[x, "IT100"]

        if x == 0:
            if dataFrame.loc[x+5, "IT100"] > -1:
                dataFrame.loc[x, "CD"] = 1
            else:
                dataFrame.loc[x, "CD"] = -1
        else:
            if current > 1:
                dataFrame.loc[x, "CD"] = 1
            elif current < -1:
                dataFrame.loc[x, "CD"] = -1
            else:
                dataFrame.loc[x, "CD"] = dataFrame.loc[x-1, "CD"]

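The if/else loop is very slow. I have seen people suggest np.select() or DataFrame.apply(), but I don't know whether either works for my example. I need to be able to index the column, because one of my conditions sets the new column's value to the value of the previous cell in that same column.
Thanks for your help!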

Best answer

@Grajdeanu Alex is right: the loop itself slows you down more than anything you do inside it. With pandas, looping is usually the slowest option. Try this:

import pandas as pd
import numpy as np

df = pd.DataFrame({'IT100': [0, -50, -20, -0.5, -0.25, -0.5, -10, 5, 0.5]})
df['CD'] = np.nan
# lower saturation: discharging
df.loc[df['IT100'] < -1, 'CD'] = -1
# upper saturation: charging
df.loc[df['IT100'] > 1, 'CD'] = 1
# fill forward: rows that are still NaN take the previous row's value
df['CD'] = df['CD'].ffill()
# set the first row equal to the row at index 5
df.loc[0, 'CD'] = df.loc[5, 'CD']

ffill uses the last valid value to fill the NaN values that follow it, which here are the rows where IT100 lies between -1 and 1.
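
Since the question asks whether np.select() applies here, the following is a minimal sketch of the same idea using np.select() plus ffill. The function name charge_state is hypothetical (it is not from the original answer), and the sample data is the IT100 column from the question.

```python
import numpy as np
import pandas as pd


def charge_state(it100: pd.Series) -> pd.Series:
    """Hypothetical helper: 1 if charging, -1 if discharging, else previous state."""
    values = it100.to_numpy()
    # np.select takes the first matching condition; rows matching none get NaN.
    raw = np.select([values > 1, values < -1], [1.0, -1.0], default=np.nan)
    # ffill carries the last known state forward into the NaN rows.
    return pd.Series(raw, index=it100.index).ffill()


df = pd.DataFrame({'IT100': [5, 5, -0.001371, 0.0002095, -5, 0, -5, 5, 5]})
df['CD'] = charge_state(df['IT100'])
print(df['CD'].tolist())
# → [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, 1.0, 1.0]
```

Note that ffill has nothing to carry forward if the column starts with values between -1 and 1, so those leading rows stay NaN in this sketch. A first-row rule such as the one in the question would have to be applied separately.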

This post, "python - How to speed up conditional statements in Python", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/65485321/
