gpt4 book ai didi

python - 如何使用 python beautifulsoup 从下拉菜单中提取数据

转载 作者:行者123 更新时间:2023-12-02 04:23:28 24 4
gpt4 key购买 nike

我试图从一个具有多级下拉菜单的网站中抓取数据,每次选择一个项目时,它都会更改子下拉菜单的子项目。问题是,对于每个循环,它都会从下拉项中提取相同的子项。选择发生,但它不会代表循环中的新选择更新项目任何人都可以帮助我为什么我没有得到想要的结果。也许这是因为我的下拉列表是java脚本或其他东西。

例如下图中的手册: image of the dropdown我已经走了这么远:

enter code here

from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.by import By
import csv

//#from selenium.webdriver.support import Select
import time

print ("opening chorome....")
driver = webdriver.Chrome()
driver.get('https://www.wheelmax.com/')
time.sleep(10)

csvData = ['Year', 'Make', 'Model', 'Body', 'Submodel', 'Size']

//#variables
yeart = []
make= []
model=[]
body = []
submodel = []
size = []
Yindex = Mkindex = Mdindex = Bdindex = Smindex = Sindex = 0

print ("waiting for program to set variables....")
time.sleep(20)

print ("initializing and setting variables....")

//#initializing Year
Year = Select(driver.find_element_by_id("icm-years-select"))
Year.select_by_value('2020')
yr = driver.find_elements(By.XPATH, '//*[@id="icm-years-select"]')
time.sleep(15)

//#initializing Make
Make = Select(driver.find_element_by_id("icm-makes-select"))
Make.select_by_index(1)
mk = driver.find_elements(By.XPATH, '//*[@id="icm-makes-select"]')
time.sleep(15)

//#initializing Model
Model = Select(driver.find_element_by_id("icm-models-select"))
Model.select_by_index(1)
mdl = driver.find_elements(By.XPATH, '//*[@id="icm-models-select"]')
time.sleep(15)

//#initializing body
Body = Select(driver.find_element_by_id("icm-drivebodies-select"))
Body.select_by_index(1)
bdy = driver.find_elements(By.XPATH, '//*[@id="icm-drivebodies-select"]')
time.sleep(15)

//#initializing submodel
Submodel = Select(driver.find_element_by_id("icm-submodels-select"))
Submodel.select_by_index(1)
sbm = driver.find_elements(By.XPATH, '//*[@id="icm-submodels-select"]')
time.sleep(15)

//#initializing size
Size = Select(driver.find_element_by_id("icm-sizes-select"))
Size.select_by_index(0)
siz = driver.find_elements(By.XPATH, '//*[@id="icm-sizes-select"]')
time.sleep(5)


Cyr = Cmk = Cmd = Cbd = Csmd = Csz = ""

print ("fetching data from variables....")

for y in yr:
obj1 = driver.find_element_by_id("icm-years-select")
Year = Select(obj1)
Year.select_by_index(++Yindex)
obj1.click()
#obj1.click()
yeart.append(y.text)
Cyr = y.text
time.sleep(10)
for m in mk:
obj2 = driver.find_element_by_id("icm-makes-select")
Make = Select(obj2)
Make.select_by_index(++Mkindex)
obj2.click()
#obj2.click()
make.append(m.text)
Cmk = m.text
time.sleep(10)
for md in mdl:
Mdindex =0
obj3 = driver.find_element_by_id("icm-models-select")
Model = Select(obj3)
Model.select_by_index(++Mdindex)
obj3.click()
#obj3.click(clickobj)
model.append(md.text)
Cmd = md.text
time.sleep(10)
Bdindex = 0
for bd in bdy:
obj4 = driver.find_element_by_id("icm-drivebodies-select")
Body = Select(obj4)
Body.select_by_index(++Bdindex)
obj4.click()
#obj4.click(clickobj2)
body.append(bd.text)
Cbd = bd.text
time.sleep(10)
Smindex = 0
for sm in sbm:
obj5 = driver.find_element_by_id("icm-submodels-select")
Submodel = Select(obj5)
obj5.click()
Submodel.select_by_index(++Smindex)
#obj5.click(clickobj5)
submodel.append(sm.text)
Csmd = sm.text
time.sleep(10)
Sindex = 0
for sz in siz:
Size = Select(driver.find_element_by_id("icm-sizes-select"))
Size.select_by_index(++Sindex)
size.append(sz.text)
Scz = sz.text
csvData += [Cyr, Cmk, Cmd, Cbd,Csmd, Csz]

最佳答案

因为 https://www.wheelmax.com 具有相互依赖的多级下拉菜单,例如,如果您选择 Select Year 下拉选项,之后selected Year based on Select Make 下拉列表启用并显示基于所选年份选项的选项。

所以基本上你需要使用 Selenium 包来处理动态选项。

根据您的浏览器安装 selenium Web 驱动程序

下载 Chrome Web 驱动程序:

http://chromedriver.chromium.org/downloads

安装 Chrome 浏览器的网络驱动程序:

unzip ~/Downloads/chromedriver_linux64.zip -d ~/Downloads
chmod +x ~/Downloads/chromedriver
sudo mv -f ~/Downloads/chromedriver /usr/local/share/chromedriver
sudo ln -s /usr/local/share/chromedriver /usr/local/bin/chromedriver
sudo ln -s /usr/local/share/chromedriver /usr/bin/chromedriver

Selenium 教程

https://selenium-python.readthedocs.io/

例如。使用selenium选择多个下拉选项

from selenium import webdriver
from selenium.webdriver.support.ui import Select
import time

driver = webdriver.Chrome()
driver.get('https://www.wheelmax.com/')
time.sleep(4)

selectYear = Select(driver.find_element_by_id("icm-years-select"))
selectYear.select_by_value('2019')

time.sleep(2)

selectMakes = Select(driver.find_element_by_id("icm-makes-select"))
selectMakes.select_by_value('58')

更新:

选择下拉选项值或计算选项总数

for option in selectYear.options:
print(option.text)

print(len(selectYear.options))

Se more

关于python - 如何使用 python beautifulsoup 从下拉菜单中提取数据,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/56320560/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com