python - Extract the words around a word and insert the results into dataframe columns

Reposted. Author: 行者123. Updated: 2023-11-30 22:32:24

I have a dataframe df with 3 columns, as shown below:

company | year | text  
Apple | 2016 |"The Company sells its products worldwide through its..."

I want to search df['text'] for "products", extract the 3 words before and the 3 words after "products", and insert them into the dataframe as df['before'] and df['after'] respectively.

This is what I have done so far:

m = re.search(r'((?:\w+\W+){,3})(products)\W+((?:\w+\W+){,3})', df['text'])
if m:
    l = [x.strip().split() for x in m.groups()]
    df['left'], df['right'] = l[0], l[2]

However, I get this message:

TypeError: expected string or buffer

How can I get this to work?

Best answer

Use pd.Series.str.extract

pat = r'(?P<before>(?:\w+\W+){,3})products\W+(?P<after>(?:\w+\W+){,3})'
new = df.text.str.extract(pat, expand=True)

new

               before                     after
0  Company sells its   worldwide through its...
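
The TypeError in the question arises because re.search expects a single string, while df['text'] is a whole Series. For comparison with str.extract, here is a minimal row-by-row sketch that keeps the question's regex and calls it once per value. The helper name words_around and the sample row are assumptions made for illustration, and the helper returns clean space-separated words rather than the raw matched text.

```python
import re

import pandas as pd

# Sample row copied from the question; the text is truncated there too.
df = pd.DataFrame({
    "company": ["Apple"],
    "year": [2016],
    "text": ["The Company sells its products worldwide through its..."],
})

# Same pattern as in the question, compiled once.
PATTERN = re.compile(r"((?:\w+\W+){,3})products\W+((?:\w+\W+){,3})")


def words_around(text):
    """Hypothetical helper: return (before, after) strings for one text value."""
    if not isinstance(text, str):
        return "", ""  # e.g. NaN in the column
    m = PATTERN.search(text)
    if m is None:
        return "", ""  # "products" does not occur in this text
    # Keep only the words, dropping punctuation and surplus whitespace.
    before = " ".join(re.findall(r"\w+", m.group(1)))
    after = " ".join(re.findall(r"\w+", m.group(2)))
    return before, after


# Call the helper once per row instead of passing the Series to re.search.
pairs = [words_around(t) for t in df["text"]]
df["before"] = [b for b, _ in pairs]
df["after"] = [a for _, a in pairs]
```

On larger frames the vectorised str.extract call is usually the more convenient choice; the loop is mainly useful when each match needs extra post-processing.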

You can create a new dataframe with the new columns

df.assign(**new)

  company  year                                               text                     after              before
0   Apple  2016  The Company sells its products worldwide throu...  worldwide through its...  Company sells its
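
Putting the steps together, the following is a self-contained sketch of the approach under a few assumptions: the sample row is rebuilt from the question, the extracted columns are stripped of the whitespace that the regex captures, and df.join is used in place of df.assign(**new) purely as an equivalent way to attach the columns.

```python
import pandas as pd

# Sample row rebuilt from the question (the text is truncated there as well).
df = pd.DataFrame({
    "company": ["Apple"],
    "year": [2016],
    "text": ["The Company sells its products worldwide through its..."],
})

# Named groups become the column names of the extracted frame.
pat = r"(?P<before>(?:\w+\W+){,3})products\W+(?P<after>(?:\w+\W+){,3})"
new = df["text"].str.extract(pat, expand=True)

# The groups capture the separators as well, so trim surrounding whitespace.
new = new.apply(lambda col: col.str.strip())

# Attach the new columns by index; rows without a match get NaN.
result = df.join(new)
print(result[["before", "after"]])
```

Note that the pattern matches "products" anywhere, including inside a longer word; prefixing it with \b would restrict it to a word start if that matters for the data.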

Regarding python - Extract the words around a word and insert the results into dataframe columns, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45470373/
