python - Soup-ify a GET request

Reposted. Author: 太空宇宙. Last updated: 2023-11-04 11:07:36
I am trying to soup-ify a GET request:

from bs4 import BeautifulSoup
import requests
import pandas as pd

html_page = requests.get('"https://www.dataquest.io"')

soup = BeautifulSoup(html_page, "lxml")
soup.find_all('<\a>')

However, this only returns an empty list.

Best answer

This pulls the table rows and assigns each row to a dictionary, which is appended to a list. You may need to adjust the selectors slightly.

from bs4 import BeautifulSoup
import requests
from pprint import pprint

output_data = []  # A list of dicts (LoD) containing all of the table data

for i in range(1, 453):  # Loop used to paginate
    # The query string is cut off in the source, so the page number i is not used yet
    data_page = requests.get(f'https://www.dataquest.io?')
    print(data_page)

    # Pass the response body (.text), not the Response object itself
    soup = BeautifulSoup(data_page.text, "lxml")

    # Find all of the table rows. select() returns an empty list when nothing
    # matches, so no try/except is needed here
    elements = soup.select('div.head_table_t')
    secondary_elements = soup.select('div.list_table_subs')
    elements = elements + secondary_elements
    print(len(elements))

    # Iterate through the rows, pick out each column, and store it under the matching header
    for element in elements:
        link = element.select_one('div.col_1 a')
        if link is None:  # Skip rows that have no link in the first column
            continue
        data = {}
        data['Name'] = link.text.strip()
        data['Page URL'] = link['href']
        output_data.append(data)  # Append the dictionary to the list
        pprint(data)  # Pretty-print the dictionary (to see what you are receiving; this can be removed)
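
To try the same pattern without hitting the network, here is a minimal sketch that runs against an inline HTML string. The markup in SAMPLE_HTML and the helper name extract_rows are invented for illustration and are not taken from the real page. It also uses the built-in "html.parser" instead of "lxml" so that no extra parser has to be installed. It shows two things: find_all() takes a bare tag name such as 'a' rather than '<\a>', and each matching row can be turned into a dictionary.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for illustration; the real page may differ.
SAMPLE_HTML = """
<div class="head_table_t">
  <div class="col_1"><a href="/course/python"> Python Basics </a></div>
</div>
<div class="list_table_subs">
  <div class="col_1"><a href="/course/sql">SQL Basics</a></div>
</div>
<div class="list_table_subs">
  <div class="col_1">No link in this row</div>
</div>
"""


def extract_rows(html):
    """Return one dict per row that has a link in its first column."""
    soup = BeautifulSoup(html, "html.parser")
    # A comma-separated CSS selector matches either class, in document order
    rows = soup.select("div.head_table_t, div.list_table_subs")
    output = []
    for row in rows:
        link = row.select_one("div.col_1 a")
        if link is None:  # select_one() returns None when nothing matches
            continue
        output.append({
            "Name": link.get_text(strip=True),
            "Page URL": link.get("href"),
        })
    return output


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
links = soup.find_all("a")  # pass the tag name only, without angle brackets
rows = extract_rows(SAMPLE_HTML)

print(len(links))
print(rows)
```

Checking select_one() for None before reading .text avoids an AttributeError on rows that do not contain the expected link.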

Regarding python - Soup-ify a GET request, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59067700/
