
python - How do I use BeautifulSoup to parse any article from a specific site, and how do I preview the image for that link?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 09:49:09

I am trying to parse news articles from the HTML feed of a sports website. I tried the code below, but it fails with a KeyError.

The code I tried:

import requests
from bs4 import BeautifulSoup


def get_cric_info_articles():
    cricinfo_article_link = "http://www.espncricinfo.com/ci/content/story/news.html"

    r = requests.get(cricinfo_article_link)
    cricinfo_article_html = r.text

    soup = BeautifulSoup(cricinfo_article_html, "html.parser")
    # print(soup.prettify())

    cric_info_items = soup.find_all("h2", {"class": "story-title"})

    cricinfo_article_dict = {}

    for div in cric_info_items:
        # This is the line that raises the KeyError
        cricinfo_article_dict[div.find('a')['story-title']] = div.find('a')['href']

    return cricinfo_article_dict

Error message:

KeyError: 'story-title'

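Best answer

The value you are looking for is the text inside the a tag, not one of its attributes. story-title is the class of the enclosing h2, so div.find('a')['story-title'] looks up an attribute that the a tag does not have, which raises the KeyError. Read the link text with .string instead: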
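To make the distinction concrete, here is a minimal sketch that runs without a network connection. The inline HTML is a hypothetical snippet, assumed to resemble the markup the code in this thread expects; it is not copied from the live page.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet, assumed to resemble the markup on the news page.
html = (
    '<h2 class="story-title">'
    '<a href="/ci/content/story/1135157.html">'
    'Bell-Drummond leads MCC in curtain-raiser'
    '</a></h2>'
)

soup = BeautifulSoup(html, "html.parser")
link = soup.find("h2", {"class": "story-title"}).find("a")

href = link["href"]                # attribute lookup: the <a> tag has an href
title = link.string                # the text content of the <a> tag
missing = link.get("story-title")  # .get() returns None instead of raising KeyError

print(href, title, missing)
```

Subscripting a tag, as in link["href"], only ever looks at HTML attributes, which is why link["story-title"] fails. Using link.get(...) is a gentler option when an attribute may be absent.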

import requests
from bs4 import BeautifulSoup


def get_cric_info_articles():
    cricinfo_article_link = "http://www.espncricinfo.com/ci/content/story/news.html"

    r = requests.get(cricinfo_article_link)
    cricinfo_article_html = r.text

    soup = BeautifulSoup(cricinfo_article_html, "html.parser")
    # print(soup.prettify())

    # Each headline is an <h2 class="story-title"> wrapping an <a> link
    cric_info_items = soup.find_all("h2", {"class": "story-title"})
    cricinfo_article_dict = {}

    for heading in cric_info_items:
        link = heading.find('a')
        # Map the link text (the article title) to the link target
        cricinfo_article_dict[link.string] = link['href']

    return cricinfo_article_dict


print(get_cric_info_articles())

Output:

{'Bell-Drummond leads MCC in curtain-raiser': '/ci/content/story/1135157.html', 'Scotland pick Brad Wheal, Chris Sole for World Cup qualifiers': '/scotland/content/story/1135152.html', 'Newlands working to be water independent': '/southafrica/content/story/1135120.html', 'Scorchers bow out after Hurricanes pile up 210': '/australia/content/story/1135117.html', "'Strong evidence' of corruption in Ajman All Stars League - ICC ": '/ci/content/story/1135108.html', 'Du Plessis 120 powers South Africa to 269': '/south-africa-v-india-2018/content/story/1135099.html', "Plan is to expose India's middle, lower order - Harris": '/australia/content/story/1135091.html', 'Top order, King fire Scorchers into WBBL final': '/australia/content/story/1135084.html', 'Technical change brings prolific run for Mominul': '/bangladesh/content/story/1135077.html', 'Dhananjaya, Mendis lead strong Sri Lanka reply': '/bangladesh/content/story/1135075.html'}
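The href values in this output are site-relative paths, so they cannot be requested directly. One possible way to turn them into full URLs is urljoin from the standard library, sketched below. The helper name absolutize and the sample dictionary are illustrative assumptions, not part of the original answer; the base URL is the page address used in the code above.

```python
from urllib.parse import urljoin

# The page the links were scraped from; relative hrefs resolve against it.
BASE_URL = "http://www.espncricinfo.com/ci/content/story/news.html"


def absolutize(articles, base_url=BASE_URL):
    """Return a new dict with every relative href resolved to a full URL."""
    return {title: urljoin(base_url, href) for title, href in articles.items()}


# One entry taken from the output above, used as sample input.
sample = {
    "Bell-Drummond leads MCC in curtain-raiser": "/ci/content/story/1135157.html",
}
result = absolutize(sample)
print(result)
```

Because each href starts with a slash, urljoin keeps only the scheme and host of the base URL and replaces the whole path.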

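For this question (python - How do I use BeautifulSoup to parse any article from a specific site, and how do I preview the image for that link?), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48567205/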
