python - ScrapyJS - How to properly wait for page load?

Reposted. Author: 太空狗. Updated: 2023-10-30 00:54:39

I am using ScrapyJS and Splash to simulate a click on a form's submit button:

# Methods of a scrapy.Spider subclass (requires `import scrapy` at module level).
def start_requests(self):
    script = """
    function main(splash)
        assert(splash:autoload("https://ajax.googleapis.com/ajax/libs/jquery/2.1.3/jquery.min.js"))
        assert(splash:go(splash.args.url))

        -- Fill in the login form and click the submit button.
        local js = [[
            var $j = jQuery.noConflict();
            $j('#USER').val('frankcastle');
            $j('#password').val('punisher');
            $j('.button-oblong-orange.button-orange a').click();
        ]]

        assert(splash:runjs(js))

        -- Resume the Lua script once the document is ready.
        local resumeJs = [[
            function main(splash) {
                var $j = jQuery.noConflict();
                $j(document).ready(function(){
                    splash.resume();
                })
            }
        ]]

        assert(splash:wait_for_resume(resumeJs))

        return {
            html = splash:html()
        }
    end
    """
    splash_meta = {'splash': {'endpoint': 'execute', 'args': {'wait': 0.5, 'lua_source': script}}}

    for url in self.start_urls:
        yield scrapy.Request(url, self.after_login, meta=splash_meta)

def after_login(self, response):
    print(response.body)
    return

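After splash:runjs(js) finishes, I tried splash:wait_for_resume, but ended up resorting to splash:wait(5) to get the result. A fixed wait may not always work (for example, under network latency), so is there a better way?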

Best answer

It turns out the only way is to use splash:wait(), but to call it in a loop and check for the presence of some element (such as the footer).

# Method of a scrapy.Spider subclass (requires `import scrapy` at module level).
def start_requests(self):
    script = """
    function main(splash)
        assert(splash:autoload("https://ajax.googleapis.com/ajax/libs/jquery/2.1.3/jquery.min.js"))
        assert(splash:go(splash.args.url))

        -- Fill in the login form, click submit, then clear the old page body.
        local js = [[
            var $j = jQuery.noConflict();
            $j('#USER').val('frankcastle');
            $j('#password').val('punisher');
            $j('.button-oblong-orange.button-orange a').click();
            $j('body').empty() // clear body, otherwise the wait_for footer will always be true
        ]]

        assert(splash:runjs(js))

        -- Poll every 50 ms until condition() returns true.
        function wait_for(splash, condition)
            while not condition() do
                splash:wait(0.05)
            end
        end

        -- The footer reappearing means the post-login page has rendered.
        wait_for(splash, function()
            return splash:evaljs("document.querySelector('#footer') != null")
        end)

        return {
            html = splash:html()
        }
    end
    """
    splash_meta = {'splash': {'endpoint': 'execute', 'args': {'wait': 0.5, 'lua_source': script}}}

    for url in self.start_urls:
        yield scrapy.Request(url, self.after_login, meta=splash_meta)
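
Note that the wait_for loop above never gives up if #footer does not appear. As a rough illustration of the same poll-until-condition idea with a deadline, here is a minimal sketch in plain Python. It is not part of Splash or Scrapy; the names wait_for, timeout, interval and footer_present are hypothetical and chosen only for this example. The same bound could presumably be added to the Lua loop by counting the time already waited.

```python
import time


def wait_for(condition, timeout=10.0, interval=0.05):
    """Poll condition() until it is truthy or `timeout` seconds have passed.

    Returns True if the condition became truthy, False if the deadline
    passed first. (Hypothetical helper, not a Splash or Scrapy API.)
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated page check: the "footer" only shows up on the third poll.
polls = {"count": 0}


def footer_present():
    polls["count"] += 1
    return polls["count"] >= 3


found = wait_for(footer_present, timeout=1.0, interval=0.01)
print(found)
```

Returning False on timeout, rather than looping forever, lets the caller decide whether to retry, log the failure, or return whatever HTML is available.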

Regarding "python - ScrapyJS - How to properly wait for page load?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36400214/
