python - Why doesn't AsyncHTTPClient in Tornado send requests immediately?


Reposted by: 太空宇宙. Updated: 2023-11-03 18:02:26


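In my current application I use Tornado's AsyncHTTPClient to make requests to a website. The flow is complex: processing the response to a previous request leads to another request.

Concretely, I download an article, then analyze it and download the images it mentions.

What bothers me is that my logs clearly show a message indicating that .fetch() has been called on the photo URL, yet no actual HTTP request is made, as sniffed in Wireshark.

I tried changing max_client_count and switching between the Curl and Simple HTTP clients, but the behavior is always the same: the photo requests are not actually issued until all articles have been downloaded. How can I change this?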

Update: some pseudocode

@VictorSergienko I'm on Linux, so by default I guess the EPoll version is used. The whole system is too complex, but it boils down to:

from tornado import gen
from tornado.ioloop import IOLoop

@gen.coroutine
def fetch_and_process(self, url, callback):
    # fetch() resolves to an HTTPResponse; hand only its body to the callback
    response = yield self.async_client.fetch(url)
    res = yield callback(response.body)
    return res

@gen.coroutine
def process_articles(self, urls):
    wait_ids = []
    for url in urls:
        # Enqueue but don't wait for one.
        # add_callback takes the callable plus its arguments, not the result of calling it.
        IOLoop.current().add_callback(self.fetch_and_process, url, self.process_article)
        # Pseudocode as posted: legacy gen.Callback-style wait handle, keyed by URL
        wait_ids.append(yield gen.Callback(key=url))
    # Wait for all tasks to finish
    yield wait_ids

@gen.coroutine
def process_article(self, body):
    photo_url = self.extract_photo_url_from_page(body)
    do_some_stuff()
    print('I gonna download that photo ' + photo_url)
    yield self.download_photo(photo_url)

@gen.coroutine
def download_photo(self, photo_url):
    response = yield self.async_client.fetch(photo_url)
    # Binary write mode: response.body is bytes
    with open(self.construct_filename(photo_url), 'wb') as f:
        f.write(response.body)

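When it prints 'I gonna download that photo', no actual request is made! Instead, it keeps downloading more articles and enqueuing more photos until all the articles are downloaded, and only then requests all the photos in one batch.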

Best answer

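AsyncHTTPClient has a queue, which you fill immediately in process_articles ("enqueue but don't wait"). By the time the first article has been processed, its photo goes to the end of the queue, behind all the other articles.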
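
The queueing effect described above can be reproduced with a toy model. The sketch below is not Tornado's implementation: QueuedClient, its fetch method, and the URLs are hypothetical stand-ins built on the standard-library asyncio module, assuming only that the client runs at most max_clients requests at once and parks the rest in a FIFO queue.

```python
import asyncio
from collections import deque


class QueuedClient:
    """Toy model (an assumption, not Tornado code): at most max_clients
    requests run at once; the rest wait in a FIFO queue."""

    def __init__(self, max_clients):
        self.max_clients = max_clients
        self.active = 0
        self.waiting = deque()
        self.sent = []  # order in which requests actually "go out"

    async def fetch(self, url):
        if self.active >= self.max_clients:
            # No free slot: park this request at the end of the queue.
            slot = asyncio.get_running_loop().create_future()
            self.waiting.append(slot)
            await slot
        else:
            self.active += 1
        self.sent.append(url)
        try:
            await asyncio.sleep(0.01)  # pretend network latency
        finally:
            if self.waiting:
                # Hand the slot directly to the oldest waiting request.
                self.waiting.popleft().set_result(None)
            else:
                self.active -= 1
        return "body of " + url


async def fetch_and_process(client, url):
    await client.fetch(url)                 # the article
    await client.fetch(url + "/photo.jpg")  # the photo it mentions


async def main():
    client = QueuedClient(max_clients=2)
    urls = ["a1", "a2", "a3", "a4"]
    # Start every article at once: "enqueue but don't wait".
    await asyncio.gather(*(fetch_and_process(client, u) for u in urls))
    return client.sent


sent = asyncio.run(main())
print(sent)
```

In this model every photo request is parked behind the article requests that were queued up front, so no photo goes out until all four articles have been sent.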

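If you use yield self.fetch_and_process in process_articles instead of add_callback, you will alternate between articles and their photos, but you can only download one thing at a time. To keep a balance between articles and photos while still downloading more than one thing at a time, consider using the toro package for synchronization primitives. The example at http://toro.readthedocs.org/en/stable/examples/web_spider_example.html is similar to your use case.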
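
One way to get that balance is a small pool of workers that pull article URLs from a queue, each worker fetching an article and then its photo before taking the next URL. The sketch below illustrates the idea with the standard-library asyncio.Queue rather than toro itself (toro's primitives were later merged into Tornado as tornado.locks and tornado.queues); fake_fetch, worker, crawl, and the URLs are hypothetical names, and the network call is simulated.

```python
import asyncio


async def fake_fetch(url, log):
    """Hypothetical stand-in for an HTTP fetch: record the request, then wait."""
    log.append(url)
    await asyncio.sleep(0.01)  # pretend network latency
    return "body of " + url


async def worker(queue, log):
    while True:
        url = await queue.get()
        try:
            await fake_fetch(url, log)                 # download the article
            await fake_fetch(url + "/photo.jpg", log)  # then its photo, right away
        finally:
            queue.task_done()


async def crawl(urls, concurrency=2):
    log = []
    queue = asyncio.Queue()
    for url in urls:
        queue.put_nowait(url)
    # Only `concurrency` article pipelines are in flight at any time.
    workers = [asyncio.create_task(worker(queue, log)) for _ in range(concurrency)]
    await queue.join()  # wait until every URL has been fully processed
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return log


order = asyncio.run(crawl(["a1", "a2", "a3", "a4"]))
print(order)
```

With concurrency=1 this reduces to the strictly alternating, one-at-a-time behavior of yielding each fetch_and_process call. Keeping the worker count at or below the HTTP client's concurrency limit means requests do not pile up in the client's own queue.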

Regarding python - Why doesn't AsyncHTTPClient in Tornado send requests immediately?, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/27469581/
