python - How do I fix a circular import when inheriting from Scrapy's RetryMiddleware class?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 10:00:08

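I am trying to adapt Scrapy's RetryMiddleware class by overriding the _retry method with a copy-pasted version to which I have added a single line. I tried to start my custom middleware module as follows: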

import scrapy.downloadermiddlewares.retry
from scrapy.utils.python import global_object_name

However, this produces an

ImportError: cannot import name global_object_name

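According to ImportError: Cannot import name X, this kind of error is caused by circular imports, but in this case I cannot easily remove the dependency in the Scrapy source code. How can I fix this?

For completeness, here is the TorRetryMiddleware I am trying to implement: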

import logging
import scrapy.downloadermiddlewares.retry
from scrapy.utils.python import global_object_name
import apkmirror_scraper.tor_controller as tor_controller

logger = logging.getLogger(__name__)

class TorRetryMiddleware(scrapy.downloadermiddlewares.retry.RetryMiddleware):
    def __init__(self, settings):
        super(TorRetryMiddleware, self).__init__(settings)
        self.retry_http_codes = {403, 429}      # Retry on 403 ('Forbidden') and 429 ('Too Many Requests')

    def _retry(self, request, reason, spider):
        '''Same as original '_retry' method, but with a call to 'change_identity' before returning the Request.'''
        retries = request.meta.get('retry_times', 0) + 1

        stats = spider.crawler.stats
        if retries <= self.max_retry_times:
            logger.debug("Retrying %(request)s (failed %(retries)d times): %(reason)s",
                         {'request': request, 'retries': retries, 'reason': reason},
                         extra={'spider': spider})
            retryreq = request.copy()
            retryreq.meta['retry_times'] = retries
            retryreq.dont_filter = True
            retryreq.priority = request.priority + self.priority_adjust

            if isinstance(reason, Exception):
                reason = global_object_name(reason.__class__)

            stats.inc_value('retry/count')
            stats.inc_value('retry/reason_count/%s' % reason)

            tor_controller.change_identity()    # This line is added to the original '_retry' method

            return retryreq
        else:
            stats.inc_value('retry/max_reached')
            logger.debug("Gave up retrying %(request)s (failed %(retries)d times): %(reason)s",
                         {'request': request, 'retries': retries, 'reason': reason},
                         extra={'spider': spider})

Best answer

Personally, I don't think this ImportError comes from a circular import. It is more likely that your version of Scrapy does not yet include scrapy.utils.python.global_object_name.

scrapy.utils.python.global_object_name did not exist until this commit, which is not part of any released version yet (the latest release is v1.3.3), although it is targeted for v1.4.
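If the helper is missing from the installed release, one possible workaround is a guarded import with a local fallback. The sketch below is an assumption, not Scrapy's own code: the fallback simply joins the object's module and name, which is what the upstream helper is understood to do for classes.

```python
# Try the real helper first; fall back to a local stand-in on older Scrapy
# releases (or when Scrapy is not installed at all).
try:
    from scrapy.utils.python import global_object_name
except ImportError:
    def global_object_name(obj):
        """Hypothetical fallback: dotted "module.name" path of a class or function."""
        return "%s.%s" % (obj.__module__, obj.__name__)

# Example: build the stats key suffix for an exception class.
print(global_object_name(ValueError))
```

With this in place, the rest of the middleware module can call global_object_name regardless of which Scrapy version is installed.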

Please confirm that you are running the code from GitHub and that your copy actually includes that commit.

Edited:

Regarding:
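A quick way to check whether the installed code actually provides the name is to import the module and look the attribute up. This is a minimal sketch; module_has_name is a hypothetical helper written for this illustration, not part of Scrapy.

```python
import importlib


def module_has_name(module_name, attr):
    """Return True if module_name can be imported and defines attr."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself (or one of its parents) is not available.
        return False
    return hasattr(module, attr)


# False means the installed Scrapy predates the commit (or Scrapy is absent).
print(module_has_name("scrapy.utils.python", "global_object_name"))
```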

According to ImportError: Cannot import name X, this type of error is caused by circular imports,

An ImportError can have many causes. The stack trace is usually enough to identify the root cause. For example:

>>> import no_such_name
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named no_such_name

A circular import, by contrast, should produce a quite different stack trace, for example:

[pengyu@GLaDOS-Precision-7510 tmp]$ cat foo.py 
from bar import baz
baz = 1
[pengyu@GLaDOS-Precision-7510 tmp]$ cat bar.py
from foo import baz
baz = 2
[pengyu@GLaDOS-Precision-7510 tmp]$ python -c "import foo"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/foo.py", line 1, in <module>
    from bar import baz
  File "/tmp/bar.py", line 1, in <module>
    from foo import baz
ImportError: cannot import name 'baz'
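The same experiment can be reproduced from a single script. The sketch below writes two mutually importing modules into a temporary directory and captures the resulting error; the module names circ_demo_foo and circ_demo_bar are made up for this illustration, and the exact wording of the message varies between Python versions.

```python
import importlib
import os
import sys
import tempfile


def demo_circular_import():
    """Import two mutually dependent modules and return the ImportError message."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "circ_demo_foo.py"), "w") as f:
            f.write("from circ_demo_bar import baz\nbaz = 1\n")
        with open(os.path.join(tmp, "circ_demo_bar.py"), "w") as f:
            f.write("from circ_demo_foo import baz\nbaz = 2\n")

        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module("circ_demo_foo")
        except ImportError as exc:
            return str(exc)
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmp)
            for name in ("circ_demo_foo", "circ_demo_bar"):
                sys.modules.pop(name, None)
    return None


print(demo_circular_import())
```

In a circular import, the traceback passes through both modules before failing, whereas a name that is simply missing from a library fails directly at the import line.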

Regarding "python - How do I fix a circular import when inheriting from Scrapy's RetryMiddleware class?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43977262/
