
python - Scrapy's Request function is not being called

Reposted. Author: 行者123. Updated: 2023-12-01 05:40:40

I am trying to call one parse function from my main parse function, but it does not work.

Here is the code:

class CodechefSpider(CrawlSpider):
    name = "codechef_crawler"
    allowed_domains = ["codechef.com"]
    start_urls = ["http://www.codechef.com/problems/easy/","http://www.codechef.com/problems/medium/","http://www.codechef.com/problems/hard/","http://www.codechef.com/problems/challenege/"]

    rules = (Rule(SgmlLinkExtractor(allow=('/problems/[A-Z,0-9,-]+')), callback='parse_item'),)

    def parse_solution(self,response):

        hxs = HtmlXPathSelector(response)
        x = hxs.select("//tr[@class='kol']//td[8]").exctract()
        f = open('test/'+response.url.split('/')[-1]+'.txt','wb')
        f.write(x.encode("utf-8"))
        f.close()

    def parse_item(self, response):
        hxs = HtmlXPathSelector(response)
        item = Problem()
        item['title'] = hxs.select("//table[@class='pagetitle-prob']/tr/td/h1/text()").extract()
        item['content'] = hxs.select("//div[@class='node clear-block']//div[@class='content']").extract()
        filename = str(item['title'][0])
        solutions_url = 'http://www.codechef.com/status/' + response.url.split('/')[-1] + '?language=All&status=15&handle=&sort_by=Time&sorting_order=asc'
        Request(solutions_url, callback = self.parse_solution)
        f = open('problems/'+filename+'.html','wb')
        f.write("<div style='width:800px;margin:50px'>")
        for i in item['content']:
            f.write(i.encode("utf-8"))
        f.write("</div>")
        f.close()

The parse_solution method is never called. The spider runs without any errors.

Best answer

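You should write yield Request(solutions_url, callback = self.parse_solution) rather than just Request(solutions_url, callback = self.parse_solution). Constructing a Request only creates an object; Scrapy schedules a request only when the callback returns or yields it, which is why the spider ran without errors but never reached parse_solution.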
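
To see why the missing yield matters, here is a minimal sketch that uses only the standard library. It does not import Scrapy: StubRequest and collect_requests are hypothetical stand-ins invented for illustration, and the "/status" suffix is a made-up URL fragment. The sketch only assumes that the framework acts on whatever a callback hands back to it.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StubRequest:
    """Hypothetical stand-in for a request object; not Scrapy's Request."""
    url: str
    callback: Optional[Callable] = None


def parse_solution(response_url):
    # Placeholder for the follow-up callback from the question.
    return None


def parse_item_without_yield(response_url):
    # Mirrors the question: the object is built and immediately discarded,
    # so the function returns None and nothing reaches the caller.
    StubRequest(response_url + "/status", callback=parse_solution)


def parse_item_with_yield(response_url):
    # Mirrors the answer: the object is handed back to the caller.
    yield StubRequest(response_url + "/status", callback=parse_solution)


def collect_requests(callback, response_url):
    """Toy engine step (an assumption): gather what the callback hands back."""
    result = callback(response_url)
    if result is None:
        return []
    return [r for r in result if isinstance(r, StubRequest)]


if __name__ == "__main__":
    print(collect_requests(parse_item_without_yield, "http://example.com/p"))
    print(collect_requests(parse_item_with_yield, "http://example.com/p"))
```

In the version without yield, the caller receives nothing to schedule, and no exception is raised, which matches the silent behaviour described in the question. Note that once parse_solution does get called in the original spider, the misspelled .exctract() call and the attempt to call .encode() on a list are likely to raise errors of their own.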

Regarding "python - Scrapy's Request function is not being called", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/17631190/
