
python - How to efficiently add multiple columns to a pandas DataFrame whose values depend on other dynamic columns

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:06:55

Is there a better solution than the code below? On a large dataset with many columns, this code takes too long.

import numpy as np
import pandas as pd

df = pd.DataFrame({'Jan': [10, 20], 'Feb': [3, 5], 'Mar': [30, 4],
                   'Month': [3, 2], 'Year': [2016, 2016]})

#    Jan  Feb  Mar  Month  Year
# 0   10    3   30      3  2016
# 1   20    5    4      2  2016

df['Antal_1'] = np.nan
df['Antal_2'] = np.nan

# Row-by-row loop: for 2016 rows, pick the columns at positions Month-1 and Month-2.
for i in range(len(df)):
    if df['Year'][i] == 2016:
        df.loc[i, 'Antal_1'] = df.iloc[i, df['Month'][i] - 1]
        df.loc[i, 'Antal_2'] = df.iloc[i, df['Month'][i] - 2]
    else:
        df.loc[i, 'Antal_1'] = df.iloc[i, -1]
        df.loc[i, 'Antal_2'] = df.iloc[i, -2]
df
#    Jan  Feb  Mar  Month  Year  Antal_1  Antal_2
# 0   10    3   30      3  2016       30        3
# 1   20    5    4      2  2016        5       20
# (Antal_1 and Antal_2 start as NaN, so pandas displays them as floats, e.g. 30.0)

Best answer

Use df.apply instead of iterating over the rows; you should see a marginal speed-up:

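import pandas as pd

df = pd.DataFrame({'Jan': [10, 20], 'Feb': [3, 5], 'Mar': [30, 4],
                   'Month': [3, 2], 'Year': [2016, 2016]})

# Fix the column order so positional lookups are predictable.
df = df[['Jan', 'Feb', 'Mar', 'Month', 'Year']]

def calculator(row):
    m1 = row['Month']                # 1-based month number stored in the row
    m2 = row.index.get_loc('Month')  # position of the 'Month' column
    # .iloc makes the positional access explicit on a string-labelled row.
    if row['Year'] == 2016:
        return row.iloc[int(m1 - 1)], row.iloc[int(m1 - 2)]
    return row.iloc[m2 - 1], row.iloc[m2 - 2]

df['Antal_1'], df['Antal_2'] = list(zip(*df.apply(calculator, axis=1)))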

#    Jan  Feb  Mar  Month  Year  Antal_1  Antal_2
# 0   10    3   30      3  2016       30        3
# 1   20    5    4      2  2016        5       20
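
Since df.apply with axis=1 still calls a Python function once per row, a fully vectorized lookup may scale better on large frames. The following is only a sketch and not part of the original answer. It assumes every column is numeric, so that the frame converts to a single NumPy array, and it reuses the apply version's fallback (the columns just before 'Month') for rows whose Year is not 2016. The names values, rows, month_pos, is_2016, col_1 and col_2 are introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Jan': [10, 20], 'Feb': [3, 5], 'Mar': [30, 4],
                   'Month': [3, 2], 'Year': [2016, 2016]})

values = df.to_numpy()                     # 2-D array; assumes all columns are numeric
rows = np.arange(len(df))                  # one row position per record
month_pos = df.columns.get_loc('Month')    # position of the 'Month' column

# Column position to read per row: Month - 1 for 2016 rows,
# otherwise the column just before 'Month' (assumed fallback).
is_2016 = (df['Year'] == 2016).to_numpy()
col_1 = np.where(is_2016, df['Month'].to_numpy() - 1, month_pos - 1)
col_2 = col_1 - 1                          # the column one position to the left

# Integer-array indexing picks one cell per row in a single operation.
df['Antal_1'] = values[rows, col_1]
df['Antal_2'] = values[rows, col_2]
```

As with the positional lookups above, a negative position wraps around to the end of the row, so a Month value of 1 would read Antal_2 from the last column; whether that is desirable depends on the data.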

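Regarding "python - How to efficiently add multiple columns to a pandas DataFrame whose values depend on other dynamic columns", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48815842/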
