
python - I'm not sure how to print the rest of the information I need from the HTML

Reposted. Author: 行者123. Updated: 2023-12-05 05:39:35

import requests
from bs4 import BeautifulSoup
from datetime import datetime
from dateutil.relativedelta import relativedelta

evr_begin = datetime.now().strftime("%m/%d/%Y")
evr_end = (datetime.now() + relativedelta(months=1)).strftime("%m/%d/%Y")
url = "https://mms.kcbs.us/members/evr_search_ol_json.php?" \
      f"otype=TEXT&evr_map_type=2&org_id=KCBA&evr_begin={evr_begin}&evr_end={evr_end}&" \
      "evr_radius=50&evr_type=269&evr_region_type=1"
response = requests.request("GET", url)
soup = BeautifulSoup(response.text, features='lxml')
for event in soup.find_all('div', class_='row'):
    print(event.find('b').getText())
    print(event.find('i').getText())

Site link: https://mms.kcbs.us/members/evr_search.php?org_id=KCBA

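I'm not sure how to print what comes after the information I'm already printing. Part of the problem is that some of the remaining text shares the same tags, and for the rest I'm not sure which tags to target.

For example, for the first event I need to print

Frisco, CO 80443 UNITED STATES STATE CHAMPIONSHIP Reps: BUNNY TUTTLE, RICH TUTTLE, MICHAEL WINTER Prize Money: $13,050.00

with each piece separated.
If I use print(event.find('div', class_='col-md-4').getText()) inside the for loop, it prints everything as one blob.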

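Best answer

What I would do is create a dictionary that maps the name of each piece of data to the position at which it appears in each row of the table. Then collect each row into its own dictionary and append those dictionaries to a list, so you can process them once all the parsing is done.

For example: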

import requests
from bs4 import BeautifulSoup
from datetime import datetime
from dateutil.relativedelta import relativedelta
import json

# Field names for each column of a row, keyed by column index and then by the
# position of the text fragment inside that column.
data = {
    0: {0: "title", 1: "dates", 2: "city/state", 3: "country"},
    1: {0: "event", 1: "reps", 2: "prize"},
    2: {0: "results"},
}

evr_begin = datetime.now().strftime("%m/%d/%Y")
evr_end = (datetime.now() + relativedelta(months=1)).strftime("%m/%d/%Y")
url = (
    "https://mms.kcbs.us/members/evr_search_ol_json.php?"
    f"otype=TEXT&evr_map_type=2&org_id=KCBA&evr_begin={evr_begin}&evr_end={evr_end}"
    "&evr_radius=50&evr_type=269&evr_region_type=1"
)
response = requests.request("GET", url)
soup = BeautifulSoup(response.text, features='lxml')
all_data = []
for element in soup.find_all('div', class_="row"):
    event = {}
    # Each row is split into 'col-md-4' columns; name every text fragment in a
    # column according to its position within that column.
    for i, col in enumerate(element.find_all('div', class_='col-md-4')):
        for j, item in enumerate(col.strings):
            event[data[i][j]] = item
    all_data.append(event)

print(json.dumps(all_data, indent=4))

The output looks like this:

[
    {
        "title": "Frisco BBQ Challenge",
        "dates": "6/16/2022 - 6/18/2022",
        "city/state": "Frisco, CO 80443",
        "country": "UNITED STATES",
        "event": "STATE CHAMPIONSHIP",
        "reps": "Reps: BUNNY TUTTLE, RICH TUTTLE, MICHAEL WINTER",
        "prize": "Prize Money: $13,050.00",
        "results": "Results Not In"
    },
    {
        "title": "York County BBQ Festival",
        "dates": "6/17/2022 - 6/18/2022",
        "city/state": "Delta, PA 17314",
        "country": "UNITED STATES",
        "event": "STATE CHAMPIONSHIP",
        "reps": "Reps: ANGELA MCKEE, ROBERT MCKEE, LOUISE WEIDNER",
        "prize": "Prize Money: $5,500.00",
        "results": "Results Not In"
    },
    ...
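
Once the rows are in a list of dictionaries, the "process them once all the parsing is done" step might look something like the sketch below. It is not part of the original answer: clean_event is a hypothetical helper, and the "Reps:" and "Prize Money:" label formats are assumed from the two sample records shown here. One record is copied in so the snippet runs without a network request; in practice all_data would come from the parsing loop.

```python
import re

# Sample record copied from the output shown in the answer; in a real run this
# list would be the all_data built by the parsing loop.
all_data = [
    {
        "title": "Frisco BBQ Challenge",
        "dates": "6/16/2022 - 6/18/2022",
        "city/state": "Frisco, CO 80443",
        "country": "UNITED STATES",
        "event": "STATE CHAMPIONSHIP",
        "reps": "Reps: BUNNY TUTTLE, RICH TUTTLE, MICHAEL WINTER",
        "prize": "Prize Money: $13,050.00",
        "results": "Results Not In",
    },
]


def clean_event(event):
    """Return a copy of one parsed row with labels stripped and values typed.

    Hypothetical helper: the label formats are assumptions based on the
    sample output, not something the site guarantees.
    """
    cleaned = dict(event)

    # "6/16/2022 - 6/18/2022" -> separate start and end strings
    start, _, end = event.get("dates", "").partition(" - ")
    cleaned["start"] = start.strip()
    cleaned["end"] = end.strip()

    # "Reps: A, B, C" -> ["A", "B", "C"]
    reps_text = event.get("reps", "").partition(":")[2]
    cleaned["reps"] = [name.strip() for name in reps_text.split(",") if name.strip()]

    # "Prize Money: $13,050.00" -> 13050.0 (None when no amount is present)
    match = re.search(r"\$([\d,]+(?:\.\d+)?)", event.get("prize", ""))
    cleaned["prize"] = float(match.group(1).replace(",", "")) if match else None

    return cleaned


events = [clean_event(e) for e in all_data]
for e in events:
    print(e["title"], e["start"], e["end"], e["reps"], e["prize"])
```

Missing keys fall back to empty values instead of raising, which helps if some rows on the page have fewer fields than the mapping expects.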

Regarding "python - I'm not sure how to print the rest of the information I need from the HTML", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/72638787/
