Python Scrapy function call

Reposted. Author: 太空宇宙. Updated: 2023-11-04 02:47:52

I am trying to call a getNext() function from the main parse function that Scrapy calls, but it never gets called.

class BlogSpider(scrapy.Spider):
    # User agent.
    name = 'Mozilla/5.0 (Linux; Android 4.0.4; Galaxy Nexus Build/IMM76B) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.133 Mobile Safari/535.19'
    start_urls = ['http://www.tricksforums.org/best-free-movie-streaming-sites-to/']

    def getNext(self):
        print("Getting next ... ")
        # Check if next link in DB is valid and crawl.
        try:
            nextUrl = myDb.getNextUrl()
            urllib.urlopen(nextUrl).getcode()
            yield scrapy.Request(nextUrl['link'])
        except IOError as e:
            print("Server can't be reached", e.code)
            yield self.getNext()

    def parse(self, response):
        print("Parsing link: ", response.url)
        # Get all urls for further crawling.
        all_links = hxs.xpath('*//a/@href').extract()
        for link in all_links:
            if validators.url(link) and not myDb.existUrl(link) and not myDb.visited(link):
                myDb.addUrl(link)
        print("Getting next?")
        yield self.getNext()

I have tried it both with and without yield. What is wrong? What should this yield be? :)

Best answer

You are yielding a generator, when what you mean to do is yield from a generator.

If you are on Python 3.3+, you can use yield from:

yield from self.getNext()
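
To make the difference concrete, here is a minimal sketch in plain Python with no Scrapy dependency. The names get_next, parse_broken and parse_fixed are hypothetical stand-ins for the methods in the question, and the strings stand in for scrapy.Request objects. It suggests why the print inside getNext() never appears: the generator object is handed over unopened, so its body does not run.

```python
import types

calls = []  # records whether the body of get_next() actually ran


def get_next():
    # Hypothetical stand-in for getNext(); strings replace scrapy.Request objects.
    calls.append("get_next ran")
    yield "request-1"
    yield "request-2"


def parse_broken():
    # Yields the generator object itself; nothing iterates it, so its body never runs.
    yield get_next()


def parse_fixed():
    # Delegates to the inner generator, so its items are yielded one by one.
    yield from get_next()


broken = list(parse_broken())
calls_after_broken = list(calls)  # snapshot taken before the fixed version runs
fixed = list(parse_fixed())

print(isinstance(broken[0], types.GeneratorType), calls_after_broken)  # True []
print(fixed, calls)  # ['request-1', 'request-2'] ['get_next ran']
```

The same reasoning applies to the yield self.getNext() in the except branch of getNext() in the question.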

Alternatively, just do return self.getNext()
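
A small sketch of the return variant follows, again in plain Python with hypothetical names. The explicit for loop is not part of the answer; it is added here as an assumption about what one might use on Python versions without yield from. The last function illustrates a caveat worth checking in your own spider: in Python 3, once a function contains a yield of its own, a returned generator is no longer iterated by the caller.

```python
def get_next():
    # Hypothetical stand-in for getNext(); strings replace scrapy.Request objects.
    yield "request-1"
    yield "request-2"


def parse_with_return():
    # No yield in this function, so it is a plain function that
    # hands back the generator created by get_next().
    return get_next()


def parse_with_loop():
    # Explicit re-yield of each item; does not need `yield from`.
    for item in get_next():
        yield item


def parse_mixed():
    # Caveat: the yield makes this a generator function, so the value
    # after `return` is attached to StopIteration instead of being iterated.
    yield "item-0"
    return get_next()


returned = list(parse_with_return())
looped = list(parse_with_loop())
mixed = list(parse_mixed())

print(returned)  # ['request-1', 'request-2']
print(looped)    # ['request-1', 'request-2']
print(mixed)     # ['item-0']
```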

Regarding "Python Scrapy function call", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/44638287/
