How to make Selenium not wait for the full page load when the page has a slow script?

Reposted. Author: bug小助手. Updated: 2023-10-25 18:51:29



Selenium's driver.get(url) waits until the page has fully loaded. But the page I am scraping tries to load a dead JS script, so my Python script waits for it and hangs for a few minutes. This problem can occur on every page of the site.




from selenium import webdriver

driver = webdriver.Chrome()
driver.get('https://www.cortinadecor.com/productos/17/estores-enrollables-screen/estores-screen-corti-3000')
# The page tries to load: https://www.cetelem.es/eCommerceCalculadora/resources/js/eCalculadoraCetelemCombo.js
driver.find_element_by_name('ANCHO').send_keys("100")


How can I limit the wait time or block the AJAX load of that file, or is there another way?




Also, I am testing my script with webdriver.Chrome(), but I will use PhantomJS() or possibly Firefox(). So if a method relies on changing browser settings, it must be universal.



Recommended answers

By default, Selenium loads a page/URL with pageLoadStrategy set to normal, which means driver.get() waits for the full page load. To make Selenium not wait for the full page load, configure pageLoadStrategy, which supports three values:





  1. normal (full page load)

  2. eager (interactive)

  3. none



Here is the code to configure pageLoadStrategy:




  • Firefox:



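    from selenium import webdriver
    from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

    # Selenium 3 style: capabilities are passed via desired_capabilities
    caps = DesiredCapabilities.FIREFOX.copy()  # copy so the shared default dict is not mutated
    caps["pageLoadStrategy"] = "normal"  # wait for document.readyState == "complete"
    # caps["pageLoadStrategy"] = "eager"  # wait for document.readyState == "interactive"
    # caps["pageLoadStrategy"] = "none"   # do not wait for the page load at all
    driver = webdriver.Firefox(desired_capabilities=caps, executable_path=r'C:\path\to\geckodriver.exe')
    driver.get("http://google.com")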

  • Chrome:



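    from selenium import webdriver
    from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

    # Selenium 3 style: capabilities are passed via desired_capabilities
    caps = DesiredCapabilities.CHROME.copy()  # copy so the shared default dict is not mutated
    caps["pageLoadStrategy"] = "normal"  # wait for document.readyState == "complete"
    # caps["pageLoadStrategy"] = "eager"  # wait for document.readyState == "interactive"
    # caps["pageLoadStrategy"] = "none"   # do not wait for the page load at all
    driver = webdriver.Chrome(desired_capabilities=caps, executable_path=r'C:\path\to\chromedriver.exe')
    driver.get("http://google.com")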




Note: the pageLoadStrategy values normal, eager and none are required by the WebDriver W3C Editor's Draft, but eager is still a WIP (Work In Progress) in the ChromeDriver implementation. You can find a detailed discussion in “Eager” Page Load Strategy workaround for Chromedriver Selenium in Python.
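One possible way to approximate eager when only none is available is to poll document.readyState yourself after driver.get() returns. The sketch below is an illustration under assumptions, not necessarily the workaround from the linked discussion: wait_until_interactive and its timeout and poll parameters are hypothetical names, and it assumes the new document has already started loading (right after get() the browser may briefly still report the previous page's state).

```python
import time


def wait_until_interactive(driver, timeout=10.0, poll=0.1):
    """Hypothetical helper: block until the DOM is at least 'interactive'.

    `driver` is any object with Selenium's execute_script(script) method.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = driver.execute_script("return document.readyState")
        # 'interactive' is what the eager strategy waits for; 'complete' is later.
        if state in ("interactive", "complete"):
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"document.readyState is still {state!r}")
        time.sleep(poll)
```

With a driver created with pageLoadStrategy set to none, this would be called right after driver.get(url) and before locating elements.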




Based on the Selenium 4.0 docs, it now seems to work like this:


from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.page_load_strategy = 'none'
driver = webdriver.Chrome(options=options)
driver.get("http://www.google.com")
driver.quit()
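Since the question asks for an approach that works across browsers, the same idea can be wrapped in one function. This is a minimal sketch assuming Selenium 4, where both ChromeOptions and FirefoxOptions expose page_load_strategy; make_driver and the two constant tuples are hypothetical names, and PhantomJS is not covered.

```python
PAGE_LOAD_STRATEGIES = ("normal", "eager", "none")
SUPPORTED_BROWSERS = ("chrome", "firefox")


def make_driver(browser="chrome", strategy="none"):
    """Hypothetical helper: start Chrome or Firefox with a page load strategy."""
    browser = browser.lower()
    if browser not in SUPPORTED_BROWSERS:
        raise ValueError(f"unsupported browser: {browser!r}")
    if strategy not in PAGE_LOAD_STRATEGIES:
        raise ValueError(f"unknown page load strategy: {strategy!r}")

    # Imported here so the argument checks above work without Selenium installed.
    from selenium import webdriver

    if browser == "chrome":
        options = webdriver.ChromeOptions()
        options.page_load_strategy = strategy
        return webdriver.Chrome(options=options)

    options = webdriver.FirefoxOptions()
    options.page_load_strategy = strategy
    return webdriver.Firefox(options=options)
```

For example, make_driver("firefox", "eager") would start Firefox so that driver.get() returns once the DOM is interactive.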


@undetected Selenium's answer works well, but the Chrome part is not working; use the code below for Chrome:


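from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

options = Options()
capa = DesiredCapabilities.CHROME.copy()
capa["pageLoadStrategy"] = "none"  # get() returns without waiting for the page load
# 'PATH' is a placeholder for the chromedriver location
browser = webdriver.Chrome(desired_capabilities=capa, executable_path='PATH', options=options)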


More replies

It works in Firefox(). In Chrome() the "eager" option throws an "unsupported" error. I ran the following:

caps = DesiredCapabilities().CHROME
caps["pageLoadStrategy"] = "none"
driver = webdriver.Chrome(desired_capabilities=caps)
driver.get('href...')
time.sleep(5)
driver.find_element_by_name('ANCHO').send_keys("100")

@bl79 Yes :) I know. What I suggested comes from WebDriver's W3C recommendation. ChromeDriver will follow suit soon. Thanks

Instead of time.sleep, it is better to use driver.implicitly_wait.
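As a rough illustration of that suggestion, the sketch below replaces the fixed sleep with an implicit wait, so find_element keeps polling until the field appears or the wait elapses. It assumes the driver was created with pageLoadStrategy set to none; open_and_type and its parameters are hypothetical names.

```python
def open_and_type(driver, url, field_name, text, wait_seconds=5):
    """Hypothetical helper: load url, then type text into the field named field_name."""
    # Applies to every later find_element call on this driver.
    driver.implicitly_wait(wait_seconds)
    driver.get(url)  # returns early when pageLoadStrategy is "none"
    # "name" is the string value of Selenium's By.NAME locator strategy.
    field = driver.find_element("name", field_name)
    field.send_keys(text)
    return field
```

Unlike time.sleep(5), this returns as soon as the element is found rather than always waiting the full duration.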


Chrome still hasn't followed suit, and it doesn't seem like it will.

@TimWachter Check out my answer update and let me know your thoughts.
