python - Matching rows between DataFrames in pandas in Python

Reposted. Author: 太空宇宙. Updated: 2023-11-04 09:52:17
I have two DataFrames,

df1,

 Names
one two three
Sri is a good player
Ravi is a mentor
Kumar is a cricketer

df2,

 values
sri
NaN
sri, is
kumar,cricketer

I am trying to get, for each row of df2, the row in df1 that contains all of that row's comma-separated items.

My expected output is,

 values             Names
sri Sri is a good player
NaN
sri, is Sri is a good player
kumar,cricketer Kumar is a cricketer

I tried df1["Names"].str.contains("|".join(df2["values"].values.tolist()))

but I could not get the expected output because of the commas (","). Please help.
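A minimal sketch of why that attempt falls short, assuming the sample frames below (rebuilt from the tables above, so the exact data is an assumption). It shows two things: the join fails outright on the NaN entry, and once NaN is dropped each whole cell becomes one literal regex alternative, so "kumar,cricketer" can never match a sentence that has no comma in it.

```python
import numpy as np
import pandas as pd

# Sample frames rebuilt from the question (an assumption about the exact data).
df1 = pd.DataFrame({"Names": ["Sri is a good player",
                              "Ravi is a mentor",
                              "Kumar is a cricketer"]})
df2 = pd.DataFrame({"values": ["sri", np.nan, "sri, is", "kumar,cricketer"]})

# 1) str.join only accepts strings, so the float NaN makes it raise.
try:
    "|".join(df2["values"].values.tolist())
    join_failed = False
except TypeError:
    join_failed = True
print(join_failed)

# 2) After dropping NaN, every cell is used verbatim as one alternative.
pattern = "|".join(df2["values"].dropna())
print(pattern)

# The result is a mask over df1 rows, not a per-row mapping for df2,
# and only the bare "sri" alternative ever matches anything.
mask = df1["Names"].str.contains(pattern, case=False)
print(mask.tolist())
```

The comma-separated cells need to be split into individual tokens, with each token checked separately, before any match can succeed.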

Best answer

Use sets

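import numpy as np
import pandas as pd

# Lowercased word set for each non-null name in df1.
names = df1.Names.dropna()
a1 = names.map(lambda x: set(x.lower().split())).to_numpy()

# Lowercased token set for each non-null, comma-separated entry in df2.
s2 = df2['values'].dropna()
a2 = s2.map(lambda x: set(x.replace(' ', '').lower().split(','))).to_numpy()

# Entry (r, c) is True when name c is a superset of the tokens in df2 entry r.
# The extra all-True column is where argmax lands when no name matches.
i = np.column_stack([a1 >= a2[:, None], [True] * len(a2)]).argmax(1)

# The appended NaN is the value picked for entries with no match.
df2.assign(Names=pd.Series(
    np.append(names.to_numpy(), np.nan)[i], s2.index
))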

values Names
0 sri Sri is a good player
1 NaN NaN
2 sri, is Sri is a good player
3 kumar,cricketer Kumar is a cricketer
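The same superset test can also be written as a plain loop, which may be easier to follow than the broadcasted comparison. This is only a sketch under assumptions: the sample frames are rebuilt from the question, and first_match is a hypothetical helper name, not part of pandas or of the original answer.

```python
import numpy as np
import pandas as pd

# Sample frames rebuilt from the question (an assumption about the exact data).
df1 = pd.DataFrame({"Names": ["Sri is a good player",
                              "Ravi is a mentor",
                              "Kumar is a cricketer"]})
df2 = pd.DataFrame({"values": ["sri", np.nan, "sri, is", "kumar,cricketer"]})


def first_match(value, names):
    """Hypothetical helper: first name whose words include every token."""
    if pd.isna(value):
        return np.nan
    # Same tokenisation as the answer: drop spaces, lowercase, split on commas.
    tokens = set(value.replace(" ", "").lower().split(","))
    for name in names:
        if tokens <= set(name.lower().split()):  # subset test
            return name
    return np.nan  # no name contains all tokens


names = df1["Names"].dropna().tolist()
result = df2.assign(Names=[first_match(v, names) for v in df2["values"]])
print(result)
```

The loop does one Python-level set comparison per pair of rows, just as the broadcasted version does, so the choice between them is mostly about readability.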

Regarding python - Matching rows between DataFrames in pandas in Python, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47305897/
