Python/Beautiful Soup - find_all limited results

Reposted. Author: 行者123. Updated: 2023-12-01 01:25:16

I'm trying to scrape historical Premier League results. Although the HTML appears to contain all of the results, Beautiful Soup's find_all only returns 200 of them (there should be 463). Is there a way around this?

Many thanks

Matt

import requests
from bs4 import BeautifulSoup

url = "https://www.skysports.com/premier-league-results/1992-93"
url_content = requests.get(url).content
url_bs = BeautifulSoup(url_content, 'html.parser')
match_list = url_bs.find_all(attrs={"class": "fixres__item"})
print(len(match_list))
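
Before reaching for a browser, a quick diagnostic (not part of the original post) can show where the missing matches live: count how often the class name appears anywhere in the raw response versus how many elements the parsed tree yields. If the raw count is higher, the extra fixtures are in the source but sit somewhere a class search cannot reach, such as inside a script tag that JavaScript injects on demand.

import requests
from bs4 import BeautifulSoup

url = "https://www.skysports.com/premier-league-results/1992-93"
html = requests.get(url).text

# Occurrences of the class name anywhere in the raw markup,
# versus elements that actually parse as matchable tags.
raw_count = html.count("fixres__item")
parsed_count = len(BeautifulSoup(html, "html.parser").find_all(class_="fixres__item"))
print(raw_count, parsed_count)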

Best Answer
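
The answer drives a headless Chrome with Selenium: wait for the page's "Show More" button (class plus-more__text) to appear, scroll to it and click it via JavaScript so the remaining fixtures are rendered into the DOM, then hand the expanded page source to Beautiful Soup.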

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup

options = Options()
options.add_argument('--headless')
driver = webdriver.Chrome(options=options)
driver.get('https://www.skysports.com/premier-league-results/1992-93')

# Wait until the "Show More" button is present in the DOM.
WebDriverWait(driver, 20).until(
    EC.presence_of_element_located((By.CLASS_NAME, 'plus-more__text')))

# Scroll the button into view and click it via JavaScript so the
# remaining fixtures get injected into the page.
button = driver.find_element(By.CLASS_NAME, 'plus-more__text')
print('Found')
driver.execute_script("arguments[0].scrollIntoView();", button)
driver.execute_script("arguments[0].click();", button)

# Hand the expanded page source to Beautiful Soup.
html = driver.page_source
soup = BeautifulSoup(html, 'lxml')

links = soup.find_all('div', class_='fixres__item')
print(len(links))

driver.quit()
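
A Selenium-free alternative may also work if, as the questioner observed, the full set of results is already embedded in the page source. Below is a minimal sketch, assuming the hidden fixtures ship inside a script tag of type "text/show-more" that the site's JavaScript injects on click; that tag type is an assumption to verify against the actual page source, not something confirmed by the answer above.

import requests
from bs4 import BeautifulSoup

url = "https://www.skysports.com/premier-league-results/1992-93"
soup = BeautifulSoup(requests.get(url).text, "html.parser")

# Fixtures rendered directly in the initial HTML (about 200).
items = soup.find_all("div", class_="fixres__item")

# Assumed location of the rest of the season: an inert <script> payload
# that the "Show More" button would normally inject. Parse it as HTML too.
hidden = soup.find("script", attrs={"type": "text/show-more"})
if hidden and hidden.string:
    items += BeautifulSoup(hidden.string, "html.parser").find_all(
        "div", class_="fixres__item")

print(len(items))  # should reach 463 if the assumption holds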

Regarding Python/Beautiful Soup - find_all limited results, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53420366/
