
python - Scrapy does not follow the requests it is given

Reposted. Author: 行者123. Last updated: 2023-11-30 22:10:57

# -*- coding: utf-8 -*-
import logging

import scrapy
from scrapy.shell import inspect_response


class SuvlistingsSpider(scrapy.Spider):
    name = 'SuvListings'
    allowed_domains = ['https://www.gumtree.com.au']
    start_urls = [
        'https://www.gumtree.com.au/s-cars-vans-utes/sydney/carbodytype-suv/forsaleby-ownr/c18320l3003435/',
    ]

    def parse(self, response):
        self.log('Received response for listings page', level=logging.INFO)

        main = response.css('.panel-body.panel-body--flat-panel-shadow.user-ad-collection__list-wrapper')[-1]
        for a in main.css('a'):
            req = response.follow(a, callback=self.parse_item)
            yield req

    def parse_item(self, response):
        0/0
        yield {
            'price': response.xpath('normalize-space(//div[@id="ad-price"]/div/span[1])').extract(),
        }

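The code above never triggers the exception in parse_item. I am running it under the debugger in PyCharm. It uses an anchor selector, as described in the tutorial on the Scrapy website, but nothing gets scraped. What is going wrong here?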

Best answer

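In allowed_domains you must specify only a domain, without the scheme (www.gumtree.com.au). Otherwise Scrapy treats every followed request as "offsite" and drops it, because the request's domain does not match the allowed entry.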
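To see why the scheme-prefixed entry never matches, here is a minimal standard-library sketch of a host comparison. It is a simplified model written for illustration, not Scrapy's actual offsite filter, and the helper names to_domain and is_onsite are hypothetical. In the spider itself the fix is simply allowed_domains = ['www.gumtree.com.au'].

```python
from urllib.parse import urlparse


def to_domain(entry):
    """Return the bare host name for an entry that may be a full URL.

    Hypothetical helper, not part of Scrapy.
    """
    # urlparse only populates .hostname when the entry carries a scheme
    # (or a leading //); a bare domain falls through unchanged.
    host = urlparse(entry).hostname
    return host if host else entry.strip().lower()


def is_onsite(url, allowed_domains):
    """Simplified model of an offsite check (an assumption, not Scrapy's code):
    the request host must equal an allowed entry or be a subdomain of it."""
    host = urlparse(url).hostname or ''
    return any(host == d or host.endswith('.' + d) for d in allowed_domains)


url = ('https://www.gumtree.com.au/s-cars-vans-utes/sydney/'
       'carbodytype-suv/forsaleby-ownr/c18320l3003435/')

broken = ['https://www.gumtree.com.au']    # entry as written in the question
fixed = [to_domain(d) for d in broken]     # scheme stripped

print(fixed)                   # ['www.gumtree.com.au']
print(is_onsite(url, broken))  # False: a host name never equals a full URL
print(is_onsite(url, fixed))   # True
```

The comparison is made against the request's host name only, so an entry that still carries https:// can never be equal to it, and every link yielded from parse is discarded before parse_item runs.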

Regarding "python - Scrapy does not follow the requests it is given", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/51555661/
