
python - Scrapy PyInstaller OSError: could not get source code / twisted.internet.defer._DefGen_Return

Reposted. Author: 行者123. Updated: 2023-12-04 12:02:45

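I am trying to package a very simple Scrapy spider as an .exe using PyInstaller.
I have searched and read everything I could find, but I still cannot figure out what is going wrong. Any help or pointers in the right direction would be much appreciated!

If I change yield to return, the error goes away and the spider runs, except that it only returns one item (which is expected, since it is a return rather than a yield). The code works fine, without any errors, when I run it from my IDE (that is, without the PyInstaller .exe).

Note:
I am using the PyInstaller development version.

Error when running my .exe: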

2020-04-28 11:57:30 [scrapy.core.scraper] ERROR: Spider error processing <GET http://books.toscrape.com/> (referer: None)
Traceback (most recent call last):
  File "lib\site-packages\twisted\internet\defer.py", line 1418, in _inlineCallbacks
  File "lib\site-packages\scrapy\core\downloader\middleware.py", line 42, in process_request
  File "lib\site-packages\twisted\internet\defer.py", line 1362, in returnValue
twisted.internet.defer._DefGen_Return: <200 http://books.toscrape.com/>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "lib\site-packages\scrapy\utils\defer.py", line 55, in mustbe_deferred
  File "lib\site-packages\scrapy\core\spidermw.py", line 60, in process_spider_input
  File "lib\site-packages\scrapy\core\scraper.py", line 148, in call_spider
  File "lib\site-packages\scrapy\utils\misc.py", line 202, in warn_on_generator_with_return_value
  File "lib\site-packages\scrapy\utils\misc.py", line 187, in is_generator_with_return_value
  File "inspect.py", line 973, in getsource
  File "inspect.py", line 955, in getsourcelines
  File "inspect.py", line 786, in findsource
OSError: could not get source code
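
The last frames of the traceback end in inspect.getsource, which needs the original source text of the callback. The following is a minimal sketch of that failure outside PyInstaller. It assumes that a function compiled from a string (the hypothetical "<no-source>" filename below) is a fair stand-in for a callback inside a frozen executable that ships without its .py files; it does not use Scrapy at all.

```python
import inspect

# Hypothetical stand-in for a frozen spider callback: the function is compiled
# from a string, so there is no .py file for inspect to read back.
namespace = {}
code = compile("def parse(response):\n    yield response\n", "<no-source>", "exec")
exec(code, namespace)
parse = namespace["parse"]


def source_available(func):
    """Return True if inspect can read the function's source, else False."""
    try:
        inspect.getsource(func)
        return True
    except (OSError, TypeError):
        return False


# parse is a generator function, the kind of callback that the check named in
# the traceback (is_generator_with_return_value) is concerned with.
print(inspect.isgeneratorfunction(parse))
print(source_available(parse))
```

If the same lookup fails for the spider's parse method inside the .exe, that would be consistent with the error appearing only in the frozen build and not in the IDE.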

myBookSpider.py:
import scrapy
from items import scrapyStandaloneTestItem

class bookSpider(scrapy.Spider):

    name = "bookSpider"
    custom_settings = {
        "FEED_URI": "resultFile.csv",
        "FEED_FORMAT": "csv",
        "FEED_EXPORT_FIELDS": ["title", "price"]
    }

    def start_requests(self):

        urls = [
            "http://books.toscrape.com/",
        ]

        for url in urls:

            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):

        # Getting an instance of our item class
        item = scrapyStandaloneTestItem()

        # Getting all the articles with the product_pod class
        articles = response.css("article.product_pod")

        # Looping through all the article elements we got earlier
        for article in articles:

            # Getting the needed values from the site and putting them in variables
            title = article.css("a::attr(title)").extract()
            price = article.css("p.price_color::text").extract()

            # Setting the title / price fields of our item to the values we just extracted
            item["title"] = title
            item["price"] = price
            yield item

items.py:
import scrapy

class scrapyStandaloneTestItem(scrapy.Item):

    # define the fields for your item here
    title = scrapy.Field()
    price = scrapy.Field()

runSpider.py:
# In this file we will run the spider(s)
from scrapy.crawler import CrawlerProcess
from myBookSpider import bookSpider
from scrapy.utils.project import get_project_settings

def runSpider():

    # Running scraper
    process = CrawlerProcess(get_project_settings())
    process.crawl(bookSpider)
    process.start()

if __name__ == "__main__":

    runSpider()

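Best answer

A late answer, but I will leave it here for others. All you have to do is add this code to your spider:

import scrapy.utils.misc
import scrapy.core.scraper

def warn_on_generator_with_return_value_stub(spider, callable):
    pass

scrapy.utils.misc.warn_on_generator_with_return_value = warn_on_generator_with_return_value_stub
scrapy.core.scraper.warn_on_generator_with_return_value = warn_on_generator_with_return_value_stub

This replaces warn_on_generator_with_return_value, the check that the traceback shows calling inspect.getsource, with a function that does nothing.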
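
A slightly more conservative variant is sketched below. It is not part of the original answer: instead of replacing the check with a no-op, it wraps a check so that only a missing-source OSError is swallowed. The names tolerate_missing_source and is_frozen are hypothetical helpers introduced for illustration; the only external fact assumed is that PyInstaller sets sys.frozen on the bundled interpreter. The Scrapy lines are left as comments because they assume Scrapy is installed and have not been run here.

```python
import functools
import sys


def tolerate_missing_source(check):
    """Wrap a source-inspecting check so that a missing source file is not fatal."""

    @functools.wraps(check)
    def wrapper(*args, **kwargs):
        try:
            return check(*args, **kwargs)
        except OSError:
            # No .py source to read (for example inside a frozen executable):
            # skip the check instead of failing the callback.
            return None

    return wrapper


def is_frozen():
    """Return True when running from a frozen bundle (PyInstaller sets sys.frozen)."""
    return bool(getattr(sys, "frozen", False))


# Possible use in the spider module, patching the same two names as the answer
# (assumption: Scrapy is installed; not executed in this sketch):
#
# import scrapy.utils.misc
# import scrapy.core.scraper
#
# if is_frozen():
#     safe = tolerate_missing_source(scrapy.utils.misc.warn_on_generator_with_return_value)
#     scrapy.utils.misc.warn_on_generator_with_return_value = safe
#     scrapy.core.scraper.warn_on_generator_with_return_value = safe
```

With this approach the warning would still run when the source is available, and the patch would only be applied inside the .exe. The simpler stub from the answer remains the tested fix.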

Regarding python - Scrapy PyInstaller OSError: could not get source code / twisted.internet.defer._DefGen_Return, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/61478001/
