
python - Replace NaN with a random value in every row

Reposted. Author: 行者123. Updated: 2023-12-01 08:48:59

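I have a dataset with a "Self_Employed" column. The column contains the values "Yes", "No" and "NaN". I want to replace the NaN values with a value computed in calc(). I tried some of the approaches I found here, but could not find one that works for me. Here is my code; the things I tried are in the comments: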

    # Handling missing data - Self_employed
    SEyes = (df['Self_Employed']=='Yes').sum()
    SEno = (df['Self_Employed']=='No').sum()

    def calc():
        rand_SE = randint(0,(SEno+SEyes))
        if rand_SE > 81:
            return 'No'
        else:
            return 'Yes'


    # df['Self_Employed'] = df['Self_Employed'].fillna(randint(0,100))
    # df['Self_Employed'].isnull().apply(lambda v: calc())

    # df[df['Self_Employed'].isnull()] = df[df['Self_Employed'].isnull()].apply(lambda v: calc())
    # df[df['Self_Employed']]

    # df_nan['Self_Employed'] = df_nan['Self_Employed'].isnull().apply(lambda v: calc())
    # df_nan

    # for i in range(df['Self_Employed'].isnull().sum()):
    #     print(df.Self_Employed[i]

    df[df['Self_Employed'].isnull()] = df[df['Self_Employed'].isnull()].apply(lambda v: calc())
    df

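The line I tried with df_nan seems to work, but then I end up with a separate set containing only the previously missing values, whereas I want to fill the missing values in the whole dataset. For the last line I get an error; I have linked a screenshot of it. Do you understand my problem? If so, can you help?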

This is the set with only the rows where Self_Employed is NaN

This is the original dataset

This is the error
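
One likely contributor to the trouble with the last line (an assumption, since the error itself is only available as a screenshot) is that `DataFrame.apply` uses `axis=0` by default, so the function runs once per column of the row subset rather than once per missing row. The sketch below uses a hypothetical two-column toy frame and a constant `"Yes"` in place of `calc()` to show the difference in shape between the two calls:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the dataset in the question.
df = pd.DataFrame({
    "Self_Employed": ["Yes", np.nan, "No", np.nan],
    "LoanAmount": [100, 120, 90, 150],
})

mask = df["Self_Employed"].isnull()

# DataFrame.apply defaults to axis=0: the function is called once per column,
# so the result is indexed by column labels, not by row labels.
per_column = df[mask].apply(lambda v: "Yes")

# Series.apply on the single selected column calls the function once per element,
# so the result is indexed by the row labels of the missing entries.
per_row = df.loc[mask, "Self_Employed"].apply(lambda v: "Yes")

print(list(per_column.index))
print(list(per_row.index))
```

Only the second result lines up with the missing rows of the one column that needs filling.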

Best answer

Make sure that SEno + SEyes != null. Then use the .loc method to set the value of Self_Employed where it is null:

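    import numpy as np

    # The + 1 keeps SEno + SEyes at least 1, so randint's upper bound is never 0
    SEyes = (df['Self_Employed'] == 'Yes').sum() + 1
    SEno = (df['Self_Employed'] == 'No').sum()

    def calc():
        # Draw an integer from [0, SEno + SEyes); the upper bound is exclusive
        rand_SE = np.random.randint(0, SEno + SEyes)
        # 81 is the hard-coded threshold carried over from the question
        if rand_SE >= 81:
            return 'No'
        else:
            return 'Yes'

    # Select only the NaN entries of the column, then fill each with its own draw
    df.loc[df['Self_Employed'].isna(), 'Self_Employed'] = df.loc[df['Self_Employed'].isna(), 'Self_Employed'].apply(lambda x: calc())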
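
As a self-contained variation on the same idea, the following sketch fills every missing entry in one vectorized step. It is not the method from the answer above: instead of the hard-coded threshold of 81, it weights the draws by the observed share of "Yes" and "No" among the non-missing rows. The toy DataFrame and the seed are assumptions made only so the example runs on its own.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the real dataset is not shown in the post.
df = pd.DataFrame({"Self_Employed": ["Yes", "No", "No", np.nan, "No", np.nan]})

# Seeded only so the sketch is repeatable; drop the seed for fresh draws.
rng = np.random.default_rng(0)

mask = df["Self_Employed"].isna()

# Observed share of each category among the non-missing rows
# (value_counts ignores NaN by default).
freqs = df["Self_Employed"].value_counts(normalize=True)

# One independent draw per missing row, weighted by the observed shares.
df.loc[mask, "Self_Employed"] = rng.choice(
    freqs.index.to_numpy(), size=mask.sum(), p=freqs.to_numpy()
)

print(df)
```

Because the assignment goes through `.loc` with a row mask and a single column label, only the missing cells of `Self_Employed` change and the rest of the dataset stays in place.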

Regarding "python - Replace NaN with a random value in every row", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53209198/
