
python - Find the words of one DataFrame row within the rows of another DataFrame

Reposted. Author: 行者123. Updated: 2023-11-28 16:55:30

I want to check whether the words in a row of DataFrame B occur in a row of another DataFrame A, and retrieve the line number from DataFrame A.

Example of DataFrame A

      LineNumber  Description
2539  5401845     Either the well was very deep, or she fell very slowly,
4546  5409117     for she had plenty of time as she went down to look about her,
4368  5408517     and to wonder what was going to happen next

Example of DataFrame B

       Words
50062  well deep fell
44263  plenty time above
4731   plenty time down look

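For each row of DataFrame B, I want to know whether all of its words occur within any single row of DataFrame A. If they do, I want to retrieve the LineNumber from DataFrame A and assign it to that row of DataFrame B.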

The output should look like this.

       Words                  LineNumber
50062  well deep fell         5401845
44263  plenty time above
4731   plenty time down look  5409117

I tried something like this, but it did not work

a = 'for she had plenty of time as she went down to look about her,'
str = 'plenty time down look'
if all(x in str for x in a):
    print(True)
else:
    print(False)

Thanks

Best answer

Create the DataFrames

import pandas as pd

x = pd.DataFrame({"Description": [
    "for she had plenty of time as she went down to look about her",
    "for she had of time as she went down to look about her"]})

>>> x
Description
0 for she had plenty of time as she went down to look about her
1 for she had of time as she went down to look about her

y = pd.DataFrame({"Description": ["plenty time down look"]})
>>> y
Description
0 plenty time down look

Take a row of y by position, build a regex that requires every one of its words, and look up the index of the matching row in x

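# First row of y, split into words
with_words = y["Description"].iloc[0].split()
# One lookahead per word, so the pattern matches only when every word occurs
with_regex = "".join("(?=.*{})".format(word) for word in with_words)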

>>> with_regex
'(?=.*plenty)(?=.*time)(?=.*down)(?=.*look)'

>>> x.loc[(x.Description.str.contains(with_regex))].index.item()
0
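
The answer above handles a single row of y and assumes exactly one row of x matches (otherwise .item() raises). As a rough sketch of how the same idea could be applied to every row of DataFrame B from the question, the code below swaps the regex lookaheads for plain set membership on whole words. The names contains_all_words, first_matching_line, df_a and df_b are hypothetical and not part of the original answer. It also assumes that matching should be case-insensitive, that surrounding punctuation is ignored, and that the first matching row of A wins. Note that the attempt in the question fails because it loops over the characters of a instead of the words of str.

```python
import pandas as pd


def contains_all_words(description: str, words: str) -> bool:
    """Return True if every word in `words` occurs as a whole word in `description`."""
    # Strip surrounding punctuation so that "deep," is treated as "deep" (assumption).
    tokens = {tok.strip(".,;:!?\"'()").lower() for tok in description.split()}
    return all(word.lower() in tokens for word in words.split())


def first_matching_line(words: str, df_a: pd.DataFrame):
    """Return the LineNumber of the first row of df_a containing all words, else None."""
    for line_number, description in zip(df_a["LineNumber"], df_a["Description"]):
        if contains_all_words(description, words):
            return line_number
    return None


# Sample data copied from the question.
df_a = pd.DataFrame(
    {
        "LineNumber": [5401845, 5409117, 5408517],
        "Description": [
            "Either the well was very deep, or she fell very slowly,",
            "for she had plenty of time as she went down to look about her,",
            "and to wonder what was going to happen next",
        ],
    },
    index=[2539, 4546, 4368],
)
df_b = pd.DataFrame(
    {"Words": ["well deep fell", "plenty time above", "plenty time down look"]},
    index=[50062, 44263, 4731],
)

# Look up a LineNumber for each row of B; rows without a match stay missing.
df_b["LineNumber"] = df_b["Words"].apply(lambda w: first_matching_line(w, df_a))
print(df_b)
```

With the sample data, rows 50062 and 4731 receive 5401845 and 5409117, and row 44263 is left without a LineNumber because "above" does not occur in any row of A. This compares every row of B against every row of A, which is fine for small frames but may be slow for large ones.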

Regarding "python - Find the words of one DataFrame row within the rows of another DataFrame", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/58772335/
