python - How to use fillna() to impute values in a column when other columns meet certain conditions

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:03:29

I computed value counts for the rows where Credit_History is NaN.

Output when Credit_History is NaN:

Self_Employed
Yes 532
No 32

Married
No 398
Yes 21

For the numeric columns, I computed the mean of each column.

Output for the numeric columns when Credit_History is NaN:

Mean Applicant Income: 54003.1232
LoanAmount: 35435.12
Loan_Amount_Term: 360
ApplicantIncome: 30000

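How can I now use fillna() in these cases:

Case 1: when Self_Employed = Y and Married = N, Credit_History should be 0

Case 2: when Self_Employed = N and ApplicantIncome > 20000, Credit_History should be 1

Case 3: when Self_Employed = Y, Married = N and ApplicantIncome > 2000, Credit_History should be 1

Also, when it is less obvious which conditions to use with fillna(), can we use a pivot table to compute the median and then impute with fillna()?

Thanks in advance.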

Best answer

Use numpy.select. Rows for which every condition is False receive the value given by the default parameter:

import numpy as np
import pandas as pd
from itertools import product

# Demo frame with every combination of the three input columns
c = ['Self_Employed', 'Married', 'ApplicantIncome']
df = pd.DataFrame(list(product(list('NY'), list('NY'), [10000, 30000])),
                  columns=c)

# One boolean mask per case from the question
m1 = (df.Self_Employed == 'Y') & (df.Married == 'N')
m2 = (df.Self_Employed == 'N') & (df.ApplicantIncome > 20000)
m3 = m1 & (df.ApplicantIncome > 20000)

# np.select uses the first condition that is True for each row,
# so m3 (a subset of m1) never takes effect in this order
df['Credit_History'] = np.select([m1, m2, m3], [0, 1, 1], default=2)
print(df)
  Self_Employed Married  ApplicantIncome  Credit_History
0             N       N            10000               2
1             N       N            30000               1
2             N       Y            10000               2
3             N       Y            30000               1
4             Y       N            10000               0
5             Y       N            30000               0
6             Y       Y            10000               2
7             Y       Y            30000               2
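
Because numpy.select picks the first matching condition, the order of the condition list decides what happens to rows that satisfy both Case 1 and Case 3. The following is a minimal sketch, not part of the original answer, that compares the order used above with one that lists the more specific mask first. The three-row frame and the names `original` and `reordered` are made up for illustration, and whether Case 3 should override Case 1 is an assumption about the asker's intent.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row frame: rows 0 and 1 match Case 1, row 1 also matches Case 3
df = pd.DataFrame({
    "Self_Employed": ["Y", "Y", "N"],
    "Married": ["N", "N", "N"],
    "ApplicantIncome": [10000, 30000, 30000],
})

m1 = (df.Self_Employed == "Y") & (df.Married == "N")
m2 = (df.Self_Employed == "N") & (df.ApplicantIncome > 20000)
m3 = m1 & (df.ApplicantIncome > 20000)

# Order from the answer: m1 is checked first, so m3 is never reached
original = np.select([m1, m2, m3], [0, 1, 1], default=2)

# More specific mask first: rows matching m3 get 1, remaining m1 rows get 0
reordered = np.select([m3, m1, m2], [1, 0, 1], default=2)

print(original.tolist())   # [0, 0, 1]
print(reordered.tolist())  # [0, 1, 1]
```

Only the second row differs between the two results, since it is the only one that satisfies both m1 and m3.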

But if you only want to replace the missing values according to these conditions, add fillna:

# Same demo frame, now with some Credit_History values missing
c = ['Self_Employed', 'Married', 'ApplicantIncome']
df = pd.DataFrame(list(product(list('NY'), list('NY'), [10000, 30000])),
                  columns=c).assign(Credit_History=[np.nan, 1, 0, np.nan] * 2)
print(df)
  Self_Employed Married  ApplicantIncome  Credit_History
0             N       N            10000             NaN
1             N       N            30000             1.0
2             N       Y            10000             0.0
3             N       Y            30000             NaN
4             Y       N            10000             NaN
5             Y       N            30000             1.0
6             Y       Y            10000             0.0
7             Y       Y            30000             NaN

m1 = (df.Self_Employed == 'Y') & (df.Married == 'N')
m2 = (df.Self_Employed == 'N') & (df.ApplicantIncome > 20000)
m3 = m1 & (df.ApplicantIncome > 20000)

# Wrap the np.select result in a Series aligned on df.index,
# then let fillna use it only where Credit_History is NaN
s = pd.Series(np.select([m1, m2, m3], [0, 1, 1], default=2), index=df.index)
df['Credit_History'] = df['Credit_History'].fillna(s)
print(df)
  Self_Employed Married  ApplicantIncome  Credit_History
0             N       N            10000             2.0
1             N       N            30000             1.0
2             N       Y            10000             0.0
3             N       Y            30000             1.0
4             Y       N            10000             0.0
5             Y       N            30000             1.0
6             Y       Y            10000             0.0
7             Y       Y            30000             2.0
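
The question also asks whether a pivot table of medians can feed fillna() when no explicit rule is obvious. The answer does not cover this, so the following is only a sketch of one possible approach: pivot_table shows the per-group medians for inspection, and groupby(...).transform("median") produces the same medians aligned row by row so that fillna() can consume them. The frame `loans`, its values, and the choice of grouping columns are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two groups, each with one missing Credit_History
loans = pd.DataFrame({
    "Self_Employed": ["N", "N", "N", "Y", "Y", "Y"],
    "Married": ["N", "N", "N", "Y", "Y", "Y"],
    "Credit_History": [1.0, 1.0, np.nan, 0.0, 0.0, np.nan],
})

# Pivot table of group medians (NaN values are ignored by the median)
table = loans.pivot_table(values="Credit_History", index="Self_Employed",
                          columns="Married", aggfunc="median")
print(table)

# Same medians, broadcast back to one value per row of the original frame
group_median = (loans.groupby(["Self_Employed", "Married"])["Credit_History"]
                     .transform("median"))

# Fill only the missing entries; existing values are left untouched.
# Note: for a 0/1 column a tied group can yield a median of 0.5.
loans["Credit_History"] = loans["Credit_History"].fillna(group_median)
print(loans)
```

Groups that contain no observed Credit_History at all keep NaN after this step, so a final fallback such as a global median may still be needed.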

Regarding "python - How to use fillna() to impute values in a column when other columns meet certain conditions", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49088259/
