
python - Optimizing Python code with NumPy and pandas

Reposted. Author: 太空宇宙. Updated: 2023-11-03 21:42:44

I have the following code, which works:

import numpy as np
import pandas as pd

colum1 = [0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05]
colum2 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
colum3 = [0.85, 0.80, 0.80, 0.80, 0.85, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
colum4 = [1743.85, 1485.58, 1250.07, 1021.83, 818.96, 628.05, 455.40, 319.03, 190.86, 97.07, 26.96, 0.00]
df = pd.DataFrame({
    'colum1': colum1,
    'colum2': colum2,
    'colum3': colum3,
    'colum4': colum4,
})

df['result'] = 0
# Each pass recomputes the whole column from the previous pass's values,
# so the loop repeats once per row until every row has settled.
for i in range(len(colum2)):
    df['result'] = np.where(
        df['colum2'] <= 5,
        np.where(
            df['colum2'] == 1,
            df['colum4'],
            np.where(
                (df['colum4'] - (df['result'].shift(1) * (df['colum1'] * df['colum3']))) > 0,
                (df['colum4'] - (df['result'].shift(1) * (df['colum1'] * df['colum3']))),
                0
            )
        ),
        np.where(
            (df['colum4'] - (df['result'].shift(1) * df['colum1'])) > 0,
            (df['colum4'] - (df['result'].shift(1) * df['colum1'])),
            0
        )
    )

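I need to perform the same operation without resorting to a for loop. That would help a lot, because I am working with thousands of records and the loop is very slow.

My expected result is as follows: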

    colum1  colum2  colum3   colum4       result
0     0.05       1    0.85  1743.85  1743.850000
1     0.05       2    0.80  1485.58  1415.826000
2     0.05       3    0.80  1250.07  1193.436960
3     0.05       4    0.80  1021.83   974.092522
4     0.05       5    0.85   818.96   777.561068
5     0.05       6    0.00   628.05   589.171947
6     0.05       7    0.00   455.40   425.941403
7     0.05       8    0.00   319.03   297.732930
8     0.05       9    0.00   190.86   175.973354
9     0.05      10    0.00    97.07    88.271332
10    0.05      11    0.00    26.96    22.546433
11    0.05      12    0.00     0.00     0.000000

Best answer

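The first step is to remove the loop over the index and to replace the greater-than-0 tests with np.maximum. This works because np.where(a > 0, a, 0) is, for our purposes, equivalent to np.maximum(0, a).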
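
As a quick check of that equivalence, here is a minimal sketch on a small array. The values in a are hypothetical and are not taken from the question's data.

```python
import numpy as np

# Hypothetical sample values, not taken from the question's data.
a = np.array([-2.5, 0.0, 3.0, 7.25])

clipped_where = np.where(a > 0, a, 0)  # keep positives, replace the rest with 0
clipped_max = np.maximum(0, a)         # element-wise maximum against 0

same = np.array_equal(clipped_where, clipped_max)
print(same)
```

The two forms differ on NaN input: NaN > 0 is False, so np.where returns 0, whereas np.maximum propagates the NaN. In the question's data the only NaN comes from shift(1) in the first row, where the colum2 == 1 branch takes colum4 instead.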

At the same time, define the longer expressions separately to keep the code readable:

s1 = df['colum4'] - (df['result'].shift(1) * (df['colum1'] * df['colum3']))
s2 = df['colum4'] - (df['result'].shift(1) * df['colum1'])

df['result'] = np.where(df['colum2'] <= 5,
                        np.where(df['colum2'] == 1, df['colum4'],
                                 np.maximum(0, s1)),
                        np.maximum(0, s2))

The next step is to use np.select to remove the nested np.where statements:

m1 = df['colum2'] <= 5
m2 = df['colum2'] == 1

conds = [m1 & m2, m1 & ~m2]
choices = [df['colum4'], np.maximum(0, s1)]

df['result'] = np.select(conds, choices, np.maximum(0, s2))
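
For readers who have not used np.select, the following toy sketch may help. It checks the conditions in order, takes the choice belonging to the first condition that is True, and falls back to the default where none holds. The array x and the choices are hypothetical and unrelated to the question's columns.

```python
import numpy as np

# Hypothetical toy input, chosen only to show how np.select picks values.
x = np.array([1, 3, 5, 8])

conds = [x == 1, x <= 5]      # checked in order; the first True wins
choices = [x * 100, x * 10]   # one candidate array per condition
picked = np.select(conds, choices, default=-1)  # -1 where no condition holds
print(picked.tolist())
```

This prints [100, 30, 50, -1]: the value 1 satisfies both conditions but takes the first choice, and 8 satisfies neither.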

The np.select version is easier to maintain.
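
One point worth keeping in mind: each row's result depends on the previous row's result, so a single evaluation of the vectorized expressions starting from df['result'] = 0 gives result equal to colum4 rather than the expected values. The question's loop gets there by re-evaluating the whole column once per row. The sketch below is not part of the answer. It is one possible alternative that walks the rows once over plain NumPy arrays, on the assumption that nothing precedes the first row. The names rate, base, first and prev are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'colum1': [0.05] * 12,
    'colum2': list(range(1, 13)),
    'colum3': [0.85, 0.80, 0.80, 0.80, 0.85] + [0.0] * 7,
    'colum4': [1743.85, 1485.58, 1250.07, 1021.83, 818.96, 628.05,
               455.40, 319.03, 190.86, 97.07, 26.96, 0.00],
})

# Per-row factor applied to the previous result: colum1 * colum3 while
# colum2 <= 5, plain colum1 afterwards (same branches as the question's code).
rate = np.where(df['colum2'] <= 5, df['colum1'] * df['colum3'], df['colum1'])
base = df['colum4'].to_numpy()
first = (df['colum2'] == 1).to_numpy()

result = np.zeros(len(df))
prev = 0.0  # assumption: nothing precedes the first row
for i in range(len(df)):
    # colum2 == 1 restarts the sequence; otherwise subtract and floor at 0.
    prev = base[i] if first[i] else max(0.0, base[i] - prev * rate[i])
    result[i] = prev

df['result'] = result
print(df['result'].round(6).tolist())
```

This still uses a Python loop, but it touches each row once instead of recomputing every column on every iteration, so the work grows linearly with the number of rows rather than quadratically.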

Regarding python - Optimizing Python code with NumPy and pandas, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52730328/
