python - Matching keys from two different dataframes

Reposted. Author: 太空宇宙. Updated: 2023-11-04 02:40:22

I have two dataframes,

df1,
   Name    Stage  Description                                key
0  Sri     1      Sri is one of the good singer in this two  one
1  NaN     2      Thanks for reading                         two has
2  Ram     1      Ram is two of the good cricket player      three
3  ganesh  1      one driver                                 four
4  NaN     2      good buddies                               NaN


df2,
values
member of four
one of three friends
sri is a cricketer
Rahul has two brothers

I want to replace df1["key"] with the matching df2 value if the key is present in df2["values"].

I tried, df1["key"]=df2[df2["values"].str.contains("|".join(df2["values"].tolist()),na=False)]

But the output I get is in the same order as df2,

I want,

output_df,
   Name    Stage  Description                                key
0  Sri     1      Sri is one of the good singer in this two  one of three friends
1  NaN     2      Thanks for reading                         Rahul has two brothers
2  Ram     1      Ram is two of the good cricket player      one of three friends
3  ganesh  1      one driver                                 member of four
4  NaN     2      good buddies                               NaN

Best answer

I would use arrays of sets, with <= as the subset test, and numpy broadcasting to compare every key against every value.
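The two ingredients can be tried in isolation first. The sketch below uses hypothetical toy strings (keys and values are illustrative names, not the question's frames): on Python sets, <= means "is a subset of", and comparing an (n, 1) object array with an (m,) object array applies that test to every key/value pair.

```python
import numpy as np

# Hypothetical toy data, chosen only to illustrate the mechanics.
keys = ["one", "two has", "nine"]
values = ["member of four", "one of three friends", "Rahul has two brothers"]

# 1-D object arrays whose elements are sets of words.
a = np.array([set(s.split()) for s in keys], dtype=object)
b = np.array([set(s.split()) for s in values], dtype=object)

# a[:, None] has shape (3, 1) and b has shape (3,), so the comparison
# broadcasts to (3, 3): matches[r, c] is True when key r is a subset of value c.
matches = a[:, None] <= b

has_match = matches.any(axis=1)       # does key r match any value at all?
first_match = matches.argmax(axis=1)  # column of the first True in each row
```

Note that argmax returns 0 for a row with no True in it (the "nine" key here), so its result is only meaningful where has_match is True. That is why a separate any test is needed before using the argmax positions.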

import numpy as np
import pandas as pd

setify = lambda x: set(x.split())
v = df2['values'].values.astype(str)
k = df1['key'].values.astype(str)
i = df1.index

# These are the sets of words in each key and each value
a = np.array([setify(x) for x in k.tolist()])
b = np.array([setify(x) for x in v.tolist()])

# This is the broadcasting: matches[r, c] is True when
# key r is a subset of value c
matches = (a[:, None] <= b)

# Additional test that a match exists at all
any_ = matches.any(1)
# Test that the key wasn't null in the first place
nul_ = df1['key'].notnull().values
mask = any_ & nul_

# Use argmax to find where the first set match is. There
# may be more than one match. I chose to use `assign`,
# therefore I used `mask` to pass a slice of a series
# to target the correct rows.
df1.assign(key1=pd.Series(v[matches.argmax(1)], i)[mask])

   Name    Stage  Description                                key      key1
0  Sri     1      Sri is one of the good singer in this two  one      one of three friends
1  NaN     2      Thanks for reading                         two has  Rahul has two brothers
2  Ram     1      Ram is two of the good cricket player      three    one of three friends
3  ganesh  1      one driver                                 four     member of four
4  NaN     2      good buddies                               NaN      NaN
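The answer adds a new key1 column, while the question asks for key itself to be replaced. A minimal end-to-end sketch of that variant follows, using the same subset test and broadcasting. The frames are rebuilt by hand from the tables in the question, so the split between Description and key is an assumption read off the flattened layout. Writing through .loc into a copy named output_df is an illustrative choice, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Frames rebuilt by hand from the question (column split is assumed).
df1 = pd.DataFrame({
    "Name": ["Sri", np.nan, "Ram", "ganesh", np.nan],
    "Stage": [1, 2, 1, 1, 2],
    "Description": [
        "Sri is one of the good singer in this two",
        "Thanks for reading",
        "Ram is two of the good cricket player",
        "one driver",
        "good buddies",
    ],
    "key": ["one", "two has", "three", "four", np.nan],
})
df2 = pd.DataFrame({"values": [
    "member of four",
    "one of three friends",
    "sri is a cricketer",
    "Rahul has two brothers",
]})

v = df2["values"].to_numpy().astype(str)
k = df1["key"].to_numpy().astype(str)  # NaN becomes the string 'nan'

# Object arrays of word sets, one per key and one per value.
a = np.array([set(x.split()) for x in k], dtype=object)
b = np.array([set(x.split()) for x in v], dtype=object)

# matches[r, c] is True when every word of key r appears in value c.
matches = a[:, None] <= b

# Only touch rows that have a match and whose key was not null originally.
mask = matches.any(axis=1) & df1["key"].notna().to_numpy()

# Overwrite key with the first matching value; other rows stay unchanged.
output_df = df1.copy()
output_df.loc[mask, "key"] = v[matches.argmax(axis=1)][mask].tolist()
print(output_df)
```

As in the answer, only the first matching value is used when a key is a subset of several values, and the comparison is case-sensitive and word-based, so a key such as "Sri" would not match "sri is a cricketer".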

Regarding python - matching keys from two different dataframes, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46724163/
