
python - Loop through a list of strings and return the items that contain a substring

Reposted. Author: 行者123. Updated: 2023-11-28 18:09:55

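I am trying to loop through a list of sentences and extract only the items that contain a substring (a keyword). When I use return instead of yield in my function, I get a list of characters. With yield I get the complete sentences, but I know that it is a generator, and I want a complete list of every sentence that contains the word. Is .find() causing the problem, or is there a better way to extract items from a list of strings?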

import nltk
from nltk import *
import pandas as pd
f= open("filename.txt").read()
sent_list = sent_tokenize(f)

hunt = "youth" #keyword i'm searching for
def hunter(sent):
    for term in sent:
        if term.find(hunt) is not -1:
            yield term

complete_lst = [term for term in hunter(sent_list)]
df = pd.DataFrame({'key_term_sentences':complete_lst})

Best answer

There are a few errors in your code, one of which is not using split. Once they are fixed, everything works. Here is a working example:

In [31]: sent_list = ['this is first sentence for demo purposes',
    ...:              'this is second sentence containing youth and youthful',
    ...:              'this is 3rd sentence which is dummy one btw']

In [32]: hunt = 'youth'

# note that we need two `for` loops since the function takes list of sentences
In [33]: def hunter(sent_list):
    ...:     for sent in sent_list:
    ...:         for term in sent.split():
    ...:             if hunt in term:
    ...:                 yield term
    ...:

In [34]: list(hunter(sent_list))
Out[34]: ['youth', 'youthful']
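
Note that the example above yields the matching words, while the question asks for the full sentences that contain the keyword. If whole sentences are the goal, a minimal sketch (the function name sentences_containing is hypothetical, not part of the original answer) could filter at the sentence level with a list comprehension, which returns a complete list directly instead of a generator:

```python
def sentences_containing(sentences, keyword):
    """Return every sentence that contains `keyword` as a substring."""
    # `in` does a plain substring test, so 'youth' also matches 'youthful'
    return [sent for sent in sentences if keyword in sent]


sent_list = [
    'this is first sentence for demo purposes',
    'this is second sentence containing youth and youthful',
    'this is 3rd sentence which is dummy one btw',
]

matches = sentences_containing(sent_list, 'youth')
print(matches)
# ['this is second sentence containing youth and youthful']
```

The resulting list can be passed to pd.DataFrame in the same way as complete_lst in the question.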

Just to show that you can also use term.find(hunt), since you are already using it:

In [35]: def hunter(sent_list):
    ...:     for sent in sent_list:
    ...:         for term in sent.split():
    ...:             # find() returns -1 when absent; compare with !=, not `is`
    ...:             if term.find(hunt) != -1:
    ...:                 yield term
    ...:

In [36]: list(hunter(sent_list))
Out[36]: ['youth', 'youthful']
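
As for why return produced a list of characters in the question: a likely explanation is that return exits the function at the first match and hands back a single string, and iterating over a string yields its characters. The sketch below illustrates this under that assumption. The names hunter_return and hunter_yield and the sample sentences are made up for the illustration.

```python
hunt = 'youth'


def hunter_return(sentences):
    # `return` stops at the first match and gives back one string.
    # (If nothing matches, it returns None, and iterating over that fails.)
    for sent in sentences:
        if hunt in sent:
            return sent


def hunter_yield(sentences):
    # `yield` keeps going and produces every matching sentence in turn.
    for sent in sentences:
        if hunt in sent:
            yield sent


sent_list = ['no match here', 'youth is wasted', 'the youthful crowd']

# Iterating over the returned string walks through its characters.
chars = [term for term in hunter_return(sent_list)]
# Iterating over the generator walks through whole sentences.
full = [term for term in hunter_yield(sent_list)]

print(chars[:5])  # ['y', 'o', 'u', 't', 'h']
print(full)       # ['youth is wasted', 'the youthful crowd']
```

Wrapping the generator in list(...) or a list comprehension, as the question already does with complete_lst, gives the complete list.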

Regarding "python - Loop through a list of strings and return the items that contain a substring", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/51349233/
