
python - Scrapy + Splash + ScrapyJS

Reposted. Author: 太空狗. Updated: 2023-10-30 01:06:40

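I am using Splash 2.0.2 + Scrapy 1.0.5 + Scrapyjs 0.1.1, and I still cannot get JavaScript that is triggered by a click to render. Here is an example URL: https://olx.pt/anuncio/loja-nova-com-250m2-garagem-em-box-fechada-para-arrumos-IDyTzAT.html#c49d3d94cf

I still get the page without the phone number rendered: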

    import scrapy


    class OlxSpider(scrapy.Spider):
        name = "olx"
        rotate_user_agent = True
        allowed_domains = ["olx.pt"]
        start_urls = [
            "https://olx.pt/imoveis/"
        ]

        def parse(self, response):
            # Lua script run by Splash: load the page, click the second <span>
            # inside #contact_methods, wait, then return the rendered HTML.
            script = """
            function main(splash)
                splash:go(splash.args.url)
                splash:runjs('document.getElementById("contact_methods").getElementsByTagName("span")[1].click();')
                splash:wait(0.5)
                return splash:html()
            end
            """
            # Follow each listing link through the Splash "execute" endpoint.
            for href in response.css('.link.linkWithHash.detailsLink::attr(href)'):
                url = response.urljoin(href.extract())
                yield scrapy.Request(url, callback=self.parse_house_contents, meta={
                    'splash': {
                        'args': {'lua_source': script},
                        'endpoint': 'execute',
                    }
                })

            # Follow pagination links with a plain (non-Splash) request.
            for next_page in response.css('.pager .br3.brc8::attr(href)'):
                url = response.urljoin(next_page.extract())
                yield scrapy.Request(url, self.parse)

        def parse_house_contents(self, response):
            # Breakpoint used to inspect the rendered response.
            import ipdb; ipdb.set_trace()

How can I make this work?

Best answer

Add

    splash:autoload("https://code.jquery.com/jquery-2.1.3.min.js")

to the Lua script and it will work.

    function main(splash)
        splash:go(splash.args.url)
        splash:autoload("https://code.jquery.com/jquery-2.1.3.min.js")
        splash:runjs('document.getElementById("contact_methods").getElementsByTagName("span")[1].click();')
        splash:wait(0.5)
        return splash:html()
    end

.click() is a jQuery function: https://api.jquery.com/click/
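For reference, the answer's script can be assembled from Python before it is passed to Splash. The sketch below is only an illustration: the helper names `build_lua_script` and `splash_meta` and the constant names are hypothetical and do not come from the original post. It makes two assumptions worth checking against your Splash version. First, the Splash documentation describes `splash:autoload` as registering a script for subsequent page loads rather than running it immediately, so this variant calls it before `splash:go`. Second, because the question's snippet calls `.click()` on a plain DOM element, an assumed jQuery-wrapped equivalent is offered as an option.

```python
# Hypothetical helpers (not from the original post) that build the Lua
# script and the Splash request meta as plain Python data.

JQUERY_URL = "https://code.jquery.com/jquery-2.1.3.min.js"

# Click snippet from the question: a native DOM call on the second <span>.
DOM_CLICK_JS = (
    'document.getElementById("contact_methods")'
    '.getElementsByTagName("span")[1].click();'
)

# Assumed jQuery equivalent; it only works once jQuery is loaded in the page.
JQUERY_CLICK_JS = '$("#contact_methods span").eq(1).click();'

LUA_TEMPLATE = """
function main(splash)
    -- Assumption: register jQuery before navigating so it loads with the page.
    splash:autoload("%(jquery_url)s")
    assert(splash:go(splash.args.url))
    splash:runjs('%(click_js)s')
    splash:wait(%(wait)s)
    return splash:html()
end
"""


def build_lua_script(click_js=DOM_CLICK_JS, jquery_url=JQUERY_URL, wait=0.5):
    """Return the Lua source with the click snippet and wait time filled in."""
    if "'" in click_js:
        # The snippet is embedded in a single-quoted Lua string.
        raise ValueError("click_js must not contain single quotes")
    return LUA_TEMPLATE % {
        "jquery_url": jquery_url,
        "click_js": click_js,
        "wait": wait,
    }


def splash_meta(lua_source):
    """Return the request meta dict in the same shape the question uses."""
    return {
        "splash": {
            "args": {"lua_source": lua_source},
            "endpoint": "execute",
        }
    }
```

Inside the spider, the result would be passed as `meta=splash_meta(build_lua_script())` on the `scrapy.Request` for each listing URL, or with `click_js=JQUERY_CLICK_JS` to try the jQuery variant. If the phone number still does not appear, a longer `wait` value is the first thing to try.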

Regarding python - Scrapy + Splash + ScrapyJS, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/35780666/
