
Python/Selenium "hover-and-click" does not work on a WebElement whose class changes on hover

Reposted. Author: 太空宇宙. Updated: 2023-11-04 04:42:34

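I am using the Selenium library in Python to scrape a website written in JavaScript. My strategy is to navigate the site with Selenium and scrape with BeautifulSoup at the appropriate points. This works well in simple tests, except in one case: I need to click the "<" button on a calendar widget.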

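The button's class changes on hover, so I use ActionChains to move to the element and then click it (I also use sleeps to give the browser enough time to load the page). Python does not raise any exception, but the click has no effect (that is, the calendar does not move back).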

Below are the website and a sample of the code I wrote. Do you know why this happens and/or how I can fix it? Thank you very much.

Website: https://burocomercial.profeco.gob.mx/index.jsp

Code:

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
import time

driver = webdriver.Chrome(path_to_webdriver)
driver.get('https://burocomercial.profeco.gob.mx/index.jsp') #access website

# Search bar and search button
search_bar = driver.find_elements_by_xpath('//*[@id="txtbuscar"]')
search_button = driver.find_element_by_xpath('//*[@id="contenido"]/div[2]/div[2]/div[2]/div/div[2]/div/button')

# Perform search
search_bar[0].send_keys("inmobiliaria")
search_button.click()

# Select result
time.sleep(2)
xpath='//*[@id="resultados"]/div[4]/table/tbody/tr[1]/td[5]/button'
driver.find_elements_by_xpath(xpath)[0].click()

# Open calendar
time.sleep(5)
driver.find_element_by_xpath('//*[@id="calI"]').click() #opens calendar
time.sleep(2)

# Hover-and-click on "<" (Here's the problem!!!)
cal_button=driver.find_element_by_xpath('//div[@id="ui-datepicker-div"]/div/a')
time.sleep(4)
ActionChains(driver).move_to_element(cal_button).perform() #hover
prev_button = driver.find_element_by_class_name('ui-datepicker-prev') #catch element whose class was changed by the hover
ActionChains(driver).click(prev_button).perform() #click
time.sleep(1)
print('clicked on it a second ago. No exception was raised, but the click was not performed')
time.sleep(1)

Best answer

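Here is a different approach using requests. I think Selenium should be the last option when doing web scraping. Usually, the data can be retrieved by emulating the requests that the web application itself makes.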

import requests
from bs4 import BeautifulSoup as BS
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.101 Safari/537.36'}
## Starts session
s = requests.Session()
s.headers = headers
url_base = 'https://burocomercial.profeco.gob.mx/'
ind = 'index.jsp'
resp0 = s.get(url_base+ind) ## First request, to get the 'name' parameter that is dynamic
soup0 = BS(resp0.text, 'lxml')
param_name = soup0.select_one('input[id="txtbuscar"]')['name']
action = 'BusGeneral' # Form action used when the search form is submitted
keyword = 'inmobiliaria' # Word to search
data_buscar = {param_name:keyword,'yy':'2017'} ### Data submitted
resp1 = s.post(url_base+action,data=data_buscar) ## second request: make the search
resp2 = s.get(url_base+ind) # Third request: retrieve the results
print(resp2.text)
queja = 'Detalle_Queja.jsp' # Form action used when "Quejas" is selected
data_queja = {'Lookup':'2','Val':'1','Bus':'2','FI':'28-Nov-2016','FF':'28-Feb-2017','UA':'0'} # Form data for the queja request
# Lookup is the row number in the results table, FI is the start date, FF is the end date, and UA is the Unidad Administrativa.
# Change these parameters to run different queries.
resp3 = s.post(url_base+queja,data=data_queja) # retrieve Quejas results
print(resp3.text)

With this I got:

'\r\n\r\n\r\n\r\n\r\n\r\n1|<h2>ABITARE PROMOTORA E INMOBILIARIA, SA DE CV</h2>|0|0|0|0.00|0.00|0|0.00|0.00|0.00|0.00|0 % |0 % ||2'
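
The response looks like a single pipe-delimited record preceded by blank lines. A minimal sketch of splitting it into fields might look like the following. The function name `parse_record` is hypothetical, and because the meaning of each column is not stated here, the fields are returned as an unlabeled list rather than a named structure.

```python
import re


def parse_record(raw: str) -> list:
    """Split a pipe-delimited response into a list of cleaned fields."""
    # Drop the leading blank lines, then split on the pipe separator.
    fields = raw.strip().split("|")
    # Remove HTML tags such as <h2>...</h2> and any surrounding whitespace.
    return [re.sub(r"<[^>]+>", "", field).strip() for field in fields]


# Sample response copied from the output shown above.
raw = ('\r\n\r\n\r\n\r\n\r\n\r\n1|<h2>ABITARE PROMOTORA E INMOBILIARIA, SA DE CV</h2>'
       '|0|0|0|0.00|0.00|0|0.00|0.00|0.00|0.00|0 % |0 % ||2')
record = parse_record(raw)
print(record[1])    # company name with the <h2> tags removed
print(len(record))  # number of fields in the record
```

In the requests code above, the same function could be applied to `resp3.text` instead of the hard-coded sample.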

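That response contains the data used by the web page. This may not be exactly the answer you were looking for, but I think you may find it easier to work with requests.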
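
The form fields mentioned in the code comments (`Lookup`, `FI`, `FF`, `UA`) can be assembled by a small helper, so that different rows and date ranges can be queried without editing the dictionary by hand. This is only a sketch under assumptions: `build_queja_payload`, `format_date`, and `MONTHS` are hypothetical names, the date format (including the zero-padded day) is inferred from the sample values `28-Nov-2016` and `28-Feb-2017`, and `Val` and `Bus` are copied unchanged because their meaning is not documented here.

```python
from datetime import date

# English month abbreviations, spelled out so the result does not depend on the locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def format_date(d: date) -> str:
    """Format a date like the sample values, e.g. 28-Nov-2016 (assumed format)."""
    return f"{d.day:02d}-{MONTHS[d.month - 1]}-{d.year}"


def build_queja_payload(row: int, start: date, end: date, ua: int = 0) -> dict:
    """Build form data for the queja request (hypothetical helper)."""
    if start > end:
        raise ValueError("start date must not be after end date")
    return {
        "Lookup": str(row),        # row number in the results table
        "Val": "1",                # copied from the original example; meaning unknown
        "Bus": "2",                # copied from the original example; meaning unknown
        "FI": format_date(start),  # start date
        "FF": format_date(end),    # end date
        "UA": str(ua),             # Unidad Administrativa
    }


payload = build_queja_payload(2, date(2016, 11, 28), date(2017, 2, 28))
print(payload)
```

The resulting dictionary could then be passed as `data=payload` to the `s.post(url_base+queja, ...)` call in the code above.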

Regarding Python/Selenium "hover-and-click" not working on a WebElement whose class changes on hover, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50334940/
