
python - Match strings between two DataFrames and create a column

Reposted. Author: 太空宇宙. Updated: 2023-11-04 10:00:39

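I am trying to match partial strings from bad_boy against good_boy and create a column named Right Address in the original df (bad_boy), but I am having a hard time doing so. I have looked at the following links: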

Replace whole string if it contains substring in pandas

Return DataFrame item using partial string match on rows pandas python

import pandas as pd
bad_boy = pd.read_excel('C:/Users/Programming/.xlsx')
df = pd.DataFrame(bad_boy)

print (df['Address'].head(3))

0 1234 Stack Overflow
1 7458 Python
2 8745 Pandas

good_boy = pd.read_excel('C:/Users/Programming/.xlsx')

df2 = pd.DataFrame(good_boy)

print (df2['Address'].head(10))

0 5896 Java Road
1 1234 Stack Overflow Way
2 7459 Ruby Drive
3 4517 Numpy Creek Way
4 1642 Scipy Trail
5 7458 Python Avenue
6 8745 Pandas Lane
7 9658 Excel Road
8 7255 Html Drive
9 7459 Selenium Creek Way

I tried this:

df['Right Address'] = df.loc[df['Address'].str.contains('Address', case = False, na = False, regex = False), df2['Address']]

But this throws an error:

'None of [0.....all addresses\nName: Address, dtype: object] are in the [columns]'

Desired result:

print (df['Right Address'].head(3))

0 1234 Stack Overflow Way
1 7458 Python Avenue
2 8745 Pandas Lane

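Best answer

You can use merge together with str.extract to do the partial match, joining the two frames on the leading number of each address: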

df = df.merge(df2, left_on = df.Address.str.extract(r'(\d+)', expand = False), right_on = df2.Address.str.extract(r'(\d+)', expand = False), how = 'inner').rename(columns = {'Address_y': 'Right_Address'})

You get

             Address_x            Right_Address
0  1234 Stack Overflow  1234 Stack Overflow Way
1          7458 Python       7458 Python Avenue
2          8745 Pandas         8745 Pandas Lane
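
For anyone who wants to try the idea without the Excel files, here is a minimal self-contained sketch. It is not the answerer's exact code: the inline data stands in for the two spreadsheets, the names left, right, key and result are only illustrative, and it swaps in a left join on an explicit key column so that rows of df without a match are kept (with NaN) rather than dropped.

```python
import pandas as pd

# Hypothetical stand-ins for the two Excel sheets in the question.
df = pd.DataFrame({"Address": ["1234 Stack Overflow", "7458 Python", "8745 Pandas"]})
df2 = pd.DataFrame({"Address": [
    "5896 Java Road", "1234 Stack Overflow Way", "7459 Ruby Drive",
    "4517 Numpy Creek Way", "1642 Scipy Trail", "7458 Python Avenue",
    "8745 Pandas Lane", "9658 Excel Road", "7255 Html Drive",
    "7459 Selenium Creek Way",
]})

# Pull the first run of digits out of each address to use as the join key.
left = df.assign(key=df["Address"].str.extract(r"(\d+)", expand=False))
right = df2.assign(key=df2["Address"].str.extract(r"(\d+)", expand=False))

# Left join keeps every row of df, even when df2 has no matching number.
result = (
    left.merge(right, on="key", how="left", suffixes=("", "_right"))
        .rename(columns={"Address_right": "Right Address"})
        .drop(columns="key")
)
print(result)
```

Note that this matches on the number alone. If the same number appears more than once in df2 (as 7459 does in the sample data), a matching row in df would be repeated once per candidate, so real data may need an extra check on the street name.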

Regarding python - Match strings between two DataFrames and create a column, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43766123/
