
python - How do I search one column and fill another column with what is found?

Reposted. Author: 行者123. Updated: 2023-12-02 08:10:55

I have a Pandas DataFrame with data on fictional people. Below is a small example; each person is identified by a number.

import pandas as pd
import numpy as np
df = pd.DataFrame({ 'Number':["5569", "3385", "9832", "6457", "5346", "5462", "9873", "2366"] , 'Gender': ['Male', 'Male', 'Female', 'Male', 'Female', 'Female', 'Male', 'Female'], 'Children': [np.nan, "5569 6457", "5569", np.nan, "6457", "2366", "2366", np.nan]})

df
Number Gender Children
0 5569 Male NaN
1 3385 Male 5569 6457
2 9832 Female 5569
3 6457 Male NaN
4 5346 Female 6457
5 5462 Female 2366
6 9873 Male 2366
7 2366 Female NaN

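Some people are children of other people. I now want to create two columns, "Mother" and "Father", and fill them with the relevant numbers. I would get these by looking at the "Children" column: a person is added as the father if they are male and have the child's number in "Children", and likewise a woman is added as the mother. However, some values are NaN, and some people have more than one child (in the real dataset they may have more than 4 children).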

I have been trying .isin and similar methods, but I just can't get it to work.

The expected output for this example looks like this:

df = pd.DataFrame({ 'Number':["5569", "3385", "9832", "6457", "5346", "5462", "9873", "2366"] , 'Gender': ['Male', 'Male', 'Female', 'Male', 'Female', 'Female', 'Male', 'Female'], 'Children': [np.nan, "5569 6457", "5569", np.nan, "6457", "2366", "2366", np.nan], 'Mother':["9832", np.nan, np.nan,"5346", np.nan, np.nan, np.nan, "5462"], 'Father':["3385", np.nan, np.nan, "3385", np.nan, np.nan, np.nan, "9873"]})

df
Number Gender Children Mother Father
0 5569 Male NaN 9832 3385
1 3385 Male 5569 6457 NaN NaN
2 9832 Female 5569 NaN NaN
3 6457 Male NaN 5346 3385
4 5346 Female 6457 NaN NaN
5 5462 Female 2366 NaN NaN
6 9873 Male 2366 NaN NaN
7 2366 Female NaN 5462 9873

Best answer

Use:

df = df.join(df.assign(Children=df['Children'].str.split(' '))
               .explode('Children')
               .assign(Children=lambda x: pd.to_numeric(x['Children'],
                                                        errors='coerce'))
               .pivot_table(columns='Gender',
                            index='Children',
                            values='Number',
                            fill_value=0)
               .rename(columns={'Female': 'Mother', 'Male': 'Father'}),
             on='Number')
print(df)
Number Gender Children Mother Father
0 5569 Male NaN 9832.0 3385.0
1 3385 Male 5569 6457 NaN NaN
2 9832 Female 5569 NaN NaN
3 6457 Male NaN 5346.0 3385.0
4 5346 Female 6457 NaN NaN
5 5462 Female 2366 NaN NaN
6 9873 Male 2366 NaN NaN
7 2366 Female NaN 5462.0 9873.0
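
With the sample frame exactly as defined in the question, Number holds strings. The default mean aggregation in pivot_table then has nothing numeric to average, and the numeric Children index would not match the string Number column, so the snippet above may not run as written on that frame. Below is a minimal sketch of the same split / explode / pivot / join idea that keeps everything as strings. The `dropna` step, `aggfunc='first'`, and the names `pairs`, `parents` and `result` are assumptions of this sketch, not part of the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Number': ["5569", "3385", "9832", "6457", "5346", "5462", "9873", "2366"],
    'Gender': ['Male', 'Male', 'Female', 'Male', 'Female', 'Female', 'Male', 'Female'],
    'Children': [np.nan, "5569 6457", "5569", np.nan, "6457", "2366", "2366", np.nan],
})

# One row per (parent, child) pair; people without children are dropped.
pairs = (
    df.assign(Children=df['Children'].str.split())
      .explode('Children')
      .dropna(subset=['Children'])
)

# Child number as index, one column per parent gender.
# aggfunc='first' keeps the parent's Number as a string; this assumes each
# child has at most one listed parent of each gender.
parents = (
    pairs.pivot_table(index='Children', columns='Gender',
                      values='Number', aggfunc='first')
         .rename(columns={'Female': 'Mother', 'Male': 'Father'})
)

# Look up each person's Number in the child index to attach their parents.
result = df.join(parents, on='Number')
print(result)
```

In this variant Mother and Father come back as strings rather than floats, which matches the string Number column they refer to.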

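Note that because Series.str.split is called with a single space as the separator, the number of spaces between the values in each cell of the Children column matters.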
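
To illustrate the point about spacing, here is a small sketch on a made-up Series (the name `cells` and its values are hypothetical). It compares splitting on a literal single space with calling `str.split()` without a separator, which splits on any run of whitespace.

```python
import pandas as pd

# The second cell has two spaces between the numbers.
cells = pd.Series(["5569 6457", "5569  6457"])

# Splitting on a literal single space leaves an empty string in the second cell.
by_space = cells.str.split(' ').tolist()

# With no separator, split() treats any run of whitespace as one delimiter.
by_whitespace = cells.str.split().tolist()

print(by_space)
print(by_whitespace)
```

If the real data may contain irregular spacing, the separator-free form is probably the safer choice.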

Regarding "python - How do I search one column and fill another column with what is found?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59790303/
