
python - Comparing three lists of text to find matching words

Repost. Author: 太空宇宙. Updated: 2023-11-03 21:05:33

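Hello, I wrote a function that reads and compares words across three lists of sentences: if any word matches, the function returns the text, otherwise False. Basically, it takes a list of web elements from selenium and checks whether their text matches any entry in a list of keywords. What I want is to modify it so that, after checking, it returns the link if 1 word, or 3 or more words, match, i.e. it returns False if exactly two words match. (Currently the function returns the link if any word matches and one of the keywords matches the link.) I want the function to return the link if the number of matching words is 1, 3, 4, 5, ... and one of the keywords matches the link (only 0 and 2 return False). The lists of links and texts have equal length.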

from selenium import webdriver

def check(y, url_keywords):
    # Selenium WebElements expose get_attribute() and .text
    links = [i.get_attribute('href') for i in y]
    texts = [i.text for i in y]
    for i, link in enumerate(links):
        for keyword in url_keywords:
            if keyword in link:
                for word in sentence.lower().split():
                    if word in texts[i].lower():
                        return link
    return False

d = webdriver.Chrome(executable_path=r"C:\Users\test\PycharmProjects\chromedriver")
sentence = "hello world from python"
url_keywords = [".com", ".edu"]
d.get("https://google.com/search?q={}".format(sentence))
y = d.find_elements_by_xpath("//a[@href]")
a = check(y, url_keywords)
li = []
if a:
    check(y, url_keywords)
else:
    pass

If there is a simpler way to do this, please advise.

Best answer

from selenium import webdriver
from selenium.webdriver.common.by import By

# Use UPPERCASE for constants.
SENTENCE = "hello world from python"
URL_KEYWORDS = [".com", ".edu"]


def normalise(string):
    """
    There is often quite a bit that we want to do to normalise strings, and you
    might want to extend this later. For this reason I again make a new function,
    and also add in the "strip" method for good measure and as an example of
    extending the normalisation behaviour.
    """
    return string.lower().strip()


def is_match(keyword, link, text):
    """Return True if keyword is in link and any word of SENTENCE is in text."""
    if keyword not in link:
        return False
    normalised_text = normalise(text)
    return any(word in normalised_text for word in normalise(SENTENCE).split())


def has_correct_number_of_matches(number_of_matches):
    """Now that this function is isolated, you can define it however you want!"""
    return number_of_matches not in (0, 2)


def check(elements, url_keywords):
    # Selenium WebElements expose get_attribute() and .text
    # (get() and text_content() belong to lxml elements).
    links = [element.get_attribute("href") for element in elements]
    texts = [element.text for element in elements]

    # Use zip to avoid so much nesting! Also means you can drop the index variable "i".
    search_space = zip(links, texts)

    for link, text in search_space:
        # Let's keep track.
        number_of_matches = 0
        for keyword in url_keywords:
            # Create a separate function, again to avoid so much nesting! (see "Zen of Python")
            match = is_match(keyword, link, text)
            # If match is True, int(match) will be 1, otherwise 0.
            number_of_matches += int(match)
        if has_correct_number_of_matches(number_of_matches):
            return link
    # No link qualified.
    return False


# Use descriptive names for variables, not single letters.
# executable_path is the Selenium 3 style used in the question;
# newer Selenium releases expect a Service object instead.
driver = webdriver.Chrome(executable_path=r"C:\Users\test\PycharmProjects\chromedriver")

# The functions are defined above, so they exist by the time they are called here.
driver.get("https://google.com/search?q={}".format(SENTENCE))
elements = driver.find_elements(By.XPATH, "//a[@href]")
result = check(elements, URL_KEYWORDS)
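
Note that `check` above counts how many URL keywords match, whereas the question asks about how many words of the sentence appear in the link text. A minimal sketch of that variant follows. It assumes each link has already been reduced to an `(href, text)` pair of plain strings (for Selenium elements, that would be `element.get_attribute("href")` and `element.text`). The names `count_matching_words`, `first_acceptable_link` and `rejected_counts` are hypothetical, and matching whole words via `split()` is an assumption; the code above uses substring matching instead.

```python
SENTENCE = "hello world from python"
URL_KEYWORDS = [".com", ".edu"]


def count_matching_words(sentence, text):
    """Count distinct words of `sentence` that occur as whole words in `text`."""
    sentence_words = set(sentence.lower().split())
    text_words = set(text.lower().split())
    return len(sentence_words & text_words)


def first_acceptable_link(pairs, sentence, url_keywords, rejected_counts=(0, 2)):
    """Return the first href that contains a URL keyword and whose text has an
    acceptable number of matching words; return False if no link qualifies."""
    for href, text in pairs:
        # Skip links that contain none of the URL keywords.
        if not any(keyword in href for keyword in url_keywords):
            continue
        if count_matching_words(sentence, text) not in rejected_counts:
            return href
    return False


# Made-up (href, text) pairs standing in for scraped elements.
pairs = [
    ("https://example.org/a", "hello world from python"),  # no URL keyword
    ("https://example.com/b", "Hello world"),              # 2 matching words: rejected
    ("https://example.edu/c", "Python tutorial"),          # 1 matching word: accepted
    ("https://example.com/d", "hello world from python"),  # never reached
]

print(first_acceptable_link(pairs, SENTENCE, URL_KEYWORDS))  # https://example.edu/c
```

Because `split()` does not strip punctuation, a text such as "Python," would not count as a match here; extending the normalisation step would be the natural place to handle that.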

Regarding "python - Comparing three lists of text to find matching words", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55433103/
