
python - I'm trying to split a full name into first, middle, and last name in Pandas, but I'm stuck on replace

Reposted. Author: 太空宇宙. Updated: 2023-11-03 15:08:54

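I'm trying to split a name into parts and keep the first and last name, then strip from each part whatever it shares with the others. The first name is required, then the last name, and if anything is left over as a middle name it goes into its own column.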

df['owner1_first_name'] = (df['owner1_name'].str.split().str[0]
                           .astype(str, errors='ignore'))
df['owner1_last_name'] = (df['owner1_name'].str.split().str[-1]
                          .str.replace(df['owner1_first_name'], "")
                          .astype(str, errors='ignore'))
df['owner1_middle_name'] = (df['owner1_name']
                            .str.replace(df['owner1_first_name'], "")
                            .str.replace(df['owner1_last_name'], "")
                            .astype(str, errors='ignore'))

The problem is that I can't use .str.replace(df['owner1_name'], "")
because I get the error "TypeError: 'Series' objects are mutable, thus they cannot be hashed"

Is there any alternative syntax in pandas to achieve what I'm after?

My desired output is

Full name = THOMAS MARY D, in the owner1_name column

I want

owner1_first_name = THOMAS
owner1_middle_name = MARY
owner1_last_name = D

Best answer

I think you need mask, which replaces a value with an empty string when the values in the two columns are the same:

import pandas as pd

df = pd.DataFrame({'owner1_name':['THOMAS MARY D', 'JOE Long', 'MARY Small']})

splitted = df['owner1_name'].str.split()
df['owner1_first_name'] = splitted.str[0]
df['owner1_last_name'] = splitted.str[-1]
df['owner1_middle_name'] = splitted.str[1]
# blank the middle name where it is just the last name again (two-word names)
df['owner1_middle_name'] = (df['owner1_middle_name']
                            .mask(df['owner1_middle_name'] == df['owner1_last_name'], ''))
print (df)
     owner1_name owner1_first_name owner1_last_name owner1_middle_name
0  THOMAS MARY D            THOMAS                D               MARY
1       JOE Long               JOE             Long
2     MARY Small              MARY            Small

Which is the same as:

splitted = df['owner1_name'].str.split()
df['owner1_first_name'] = splitted.str[0]
df['owner1_last_name'] = splitted.str[-1]
middle = splitted.str[1]
df['owner1_middle_name'] = middle.mask(middle == df['owner1_last_name'], '')
print (df)
     owner1_name owner1_first_name owner1_last_name owner1_middle_name
0  THOMAS MARY D            THOMAS                D               MARY
1       JOE Long               JOE             Long
2     MARY Small              MARY            Small
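
Note that the mask approach only looks at the second token, so a name with more than three words would lose everything between the second word and the last. As a rough sketch, assuming the middle name should be every token between the first and the last (that rule and the four-word sample name are assumptions, not something the question states), the tokens can be sliced and re-joined:

```python
import pandas as pd

# The third name is a made-up sample with two middle tokens.
df = pd.DataFrame({'owner1_name': ['THOMAS MARY D', 'JOE Long', 'MARY ANN LEE Small']})

tokens = df['owner1_name'].str.split()
df['owner1_first_name'] = tokens.str[0]
df['owner1_last_name'] = tokens.str[-1]
# Slice each token list to drop the first and last entries, then re-join.
# Two-word names give an empty list, which joins to an empty string.
df['owner1_middle_name'] = tokens.str[1:-1].str.join(' ')
print(df)
```

A one-word name would still get the same value for first and last name here, so that case needs its own handling if it can occur in the data.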

EDIT:

For row-wise replacement, you can use apply with axis=1:

df = pd.DataFrame({'owner1_name':['THOMAS MARY-THOMAS', 'JOE LongJOE', 'MARY Small']})

splitted = df['owner1_name'].str.split()
df['a'] = splitted.str[0]
df['b'] = splitted.str[-1]

df['c'] = df.apply(lambda x: x['b'].replace(x['a'], ''), axis=1)
print (df)
          owner1_name       a            b      c
0  THOMAS MARY-THOMAS  THOMAS  MARY-THOMAS  MARY-
1         JOE LongJOE     JOE      LongJOE   Long
2          MARY Small    MARY        Small  Small
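
The row-wise pairing is needed because Series.str.replace takes a single pattern (a string or compiled regex) for the whole column, not a Series of patterns. Besides apply with axis=1, a plain list comprehension over zip does the same pairing. A minimal sketch on the same sample data, assuming neither column contains missing values (a missing value is not a string and has no .replace method):

```python
import pandas as pd

df = pd.DataFrame({'owner1_name': ['THOMAS MARY-THOMAS', 'JOE LongJOE', 'MARY Small']})

splitted = df['owner1_name'].str.split()
df['a'] = splitted.str[0]
df['b'] = splitted.str[-1]

# Pair each row's first token with its last token and strip one from the other.
df['c'] = [b.replace(a, '') for a, b in zip(df['a'], df['b'])]
print(df)
```

This yields the same column c as the apply version, without building a row object for every record.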

The exact code that does what I asked for in the question, in three lines, is

df['owner1_first_name'] = df['owner1_name'].str.split().str[0]
df['owner1_last_name'] = df.apply(
    lambda x: x['owner1_name'].split()[-1].replace(x['owner1_first_name'], ''),
    axis=1)
df['owner1_middle_name'] = df.apply(
    lambda x: x['owner1_name'].replace(x['owner1_first_name'], '')
                              .replace(x['owner1_last_name'], ''),
    axis=1)
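
One caveat with the replace-based version: Python's str.replace removes every occurrence of the substring, not just the matching word, and it leaves the surrounding spaces behind. A small sketch comparing it with a token-based alternative follows. The helper names middle_by_replace and middle_by_tokens and the second sample name are hypothetical, introduced only for illustration:

```python
import pandas as pd

# The second name is a made-up sample whose middle name contains the last name 'D'.
df = pd.DataFrame({'owner1_name': ['THOMAS MARY D', 'THOMAS DEAN D']})

def middle_by_replace(name):
    # Same logic as the three-line version, applied to a single name.
    first = name.split()[0]
    last = name.split()[-1].replace(first, '')
    return name.replace(first, '').replace(last, '')

def middle_by_tokens(name):
    # Hypothetical alternative: drop the first and last tokens, keep the rest.
    return ' '.join(name.split()[1:-1])

df['middle_replace'] = df['owner1_name'].apply(middle_by_replace)
df['middle_tokens'] = df['owner1_name'].apply(middle_by_tokens)
print(df)
```

On this data the replace-based column comes out as ' MARY ' and ' EAN ' (padded with spaces, and with the D stripped out of DEAN), while the token-based column gives 'MARY' and 'DEAN'. Calling .str.strip() on the result would only fix the padding, not the lost letter.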

Regarding "python - I'm trying to split a full name into first, middle, and last name in Pandas, but I'm stuck on replace", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/44406207/
