
python - Scraping Google search snippet results

Reposted. Author: 太空宇宙. Updated: 2023-11-04 05:01:42

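I'm trying to write a small program where you type in a search query, it opens your browser to show the results, and then it scrapes the Google search results and prints them out. I don't know how to do the scraping part. This is all I have so far: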

import webbrowser
from urllib.parse import quote_plus

query = input("What would you like to search: ")
# quote_plus() turns spaces into "+" and escapes characters such as "?" and "&"
webbrowser.open("https://www.google.com/search?q=" + quote_plus(query))

Say they type in "Who is Donald Trump?". Their browser opens and shows the Google results page for that query (screenshot: donald trump search result).

How would I go on to scrape the summary provided by Wikipedia and print it back to the user? Or, for that matter, scrape any data from a website?

Best answer

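To scrape the summary, you can use the select_one() method provided by bs4, passing it a CSS selector. To find a selector quickly, you can use the SelectorGadget Chrome extension or any similar tool.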

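Make sure you send a user-agent header, otherwise Google may block your request, because the default user-agent will be python-requests (if you are using the requests library). See a list of user-agents to pick one that imitates a real browser visit.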
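As a rough illustration of the point about the default user-agent, the sketch below prepares a request with and without a custom header and inspects what would be sent, without contacting Google. The `BROWSER_UA` constant and the `build_request` helper are illustrative names, not part of the original answer.

```python
import requests

# Assumption: any realistic browser User-Agent string would do here.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36"
)


def build_request(query, user_agent=None):
    """Prepare (but do not send) a Google search request."""
    headers = {"User-Agent": user_agent} if user_agent else {}
    request = requests.Request(
        "GET",
        "https://www.google.com/search",
        params={"q": query},
        headers=headers,
    )
    # A Session merges its default headers into the prepared request.
    return requests.Session().prepare_request(request)


default_req = build_request("who is donald trump")
custom_req = build_request("who is donald trump", BROWSER_UA)

print(default_req.headers["User-Agent"])  # the library default
print(custom_req.headers["User-Agent"])   # the browser-like value
```

Preparing the request rather than sending it keeps the check offline, which is handy for confirming the header before making real requests.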

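From there you can scrape any other part you want with the select_one() method. Keep in mind that you can only scrape information from the Knowledge Graph when Google actually provides it. You can use an if or try-except statement to handle the case where it is missing.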
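A minimal sketch of both guards, run against small hand-written HTML strings rather than a live Google page. The two HTML snippets, the helper names, and the use of the built-in "html.parser" are assumptions made for illustration; real Google markup is much larger and its class names change over time.

```python
from bs4 import BeautifulSoup

# Assumption: simplified stand-ins for a results page with and without
# a Knowledge Graph panel.
WITH_PANEL = '<div><h3 class="Uo8X3b">Description</h3><span>Example summary.</span></div>'
WITHOUT_PANEL = "<div><p>Ordinary results only.</p></div>"


def extract_summary(html):
    """Return the summary text, or None when the panel is absent."""
    soup = BeautifulSoup(html, "html.parser")
    node = soup.select_one(".Uo8X3b+ span")
    # select_one() returns None when nothing matches the selector.
    if node is None:
        return None
    return node.text


def extract_summary_or_default(html, default="No summary available."):
    """Same lookup, written with try-except instead of an if check."""
    soup = BeautifulSoup(html, "html.parser")
    try:
        return soup.select_one(".Uo8X3b+ span").text
    except AttributeError:  # raised when select_one() returned None
        return default


print(extract_summary(WITH_PANEL))
print(extract_summary(WITHOUT_PANEL))
print(extract_summary_or_default(WITHOUT_PANEL))
```

The if version makes the missing case explicit to the caller, while the try-except version suits cases where a fallback string is good enough.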

Code and full example:

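from bs4 import BeautifulSoup
import requests

# Browser-like User-Agent so the request is not sent as python-requests
headers = {
    "User-Agent":
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}

# Let requests URL-encode the query instead of putting raw spaces in the URL
params = {"q": "who is donald trump"}
html = requests.get("https://www.google.com/search", params=params, headers=headers).text

# The "lxml" parser requires the lxml package to be installed
soup = BeautifulSoup(html, "lxml")

# Knowledge Graph description; this selector depends on Google's current markup
summary = soup.select_one(".Uo8X3b+ span").text
print(summary)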

Output:

Donald John Trump is an American media personality and businessman who served as the 45th president of the United States from 2017 to 2021.
Born and raised in Queens, New York City, Trump attended Fordham University and the University of Pennsylvania, graduating with a bachelor's degree in 1968.

Alternatively, you can use the Google Knowledge Graph API from SerpApi. It's a paid API with a free plan. Check out the playground to see if it fits your needs.

Example code to integrate:

import os
from serpapi import GoogleSearch

params = {
    "engine": "google",
    "q": "who is donald trump",
    "api_key": os.getenv("API_KEY"),
}

search = GoogleSearch(params)
results = search.get_dict()

summary = results["knowledge_graph"]["description"]
print(summary)

Output:

Donald John Trump is an American media personality and businessman who served as the 45th president of the United States from 2017 to 2021.
Born and raised in Queens, New York City, Trump attended Fordham University and the University of Pennsylvania, graduating with a bachelor's degree in 1968.
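Since the Knowledge Graph is only present when Google provides it, the dictionary lookup can be guarded as well. The sketch below assumes the response has the shape used in the example (a "knowledge_graph" key holding a "description"). The `get_description` helper and both sample dictionaries are hypothetical stand-ins for the value returned by `search.get_dict()`.

```python
def get_description(results):
    """Return the Knowledge Graph description, or None if it is missing."""
    return results.get("knowledge_graph", {}).get("description")


# Assumption: hand-written dictionaries standing in for search.get_dict().
with_graph = {"knowledge_graph": {"description": "Example description."}}
without_graph = {"organic_results": []}

print(get_description(with_graph))
print(get_description(without_graph))
```

Chaining `.get()` avoids a KeyError when either level is absent, at the cost of not telling the caller which one was missing.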

Disclaimer: I work for SerpApi.

Regarding python - Scraping Google search snippet results, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45555330/
