python - Rate limiting an external API with RabbitMQ and Celery

Reposted by: 太空宇宙. Updated: 2023-11-04 05:24:24

I am using an external REST API that limits my API requests to 1 CPS (calls per second).

The architecture is as follows:

Versions:

Flask
RabbitMQ 3.6.4
AMQP 1.4.9
kombu 3.0.35
Celery 3.1.23
Python 2.7

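The API client sends a web request to the internal API, which processes the request and controls the rate at which tasks are sent to RabbitMQ. These tasks can take anywhere from 5 to 120 seconds. In some situations they queue up and are then sent to the external API at a higher rate than the one defined, which results in numerous failed requests (around 5% of requests fail).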

Possible solutions:

Increase the external API limit
Add more workers
Keep track of failed tasks and retry them later

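While these solutions may work, they do not really fix my rate limiter implementation, nor do they control the actual rate at which my workers process the API requests. Ultimately, I need to control the rate of calls to the external API.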

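I believe a better option would be to control the rate at which RabbitMQ delivers messages to the workers. I found the RabbitMQ prefetch option, but can anyone recommend other ways to control the rate at which messages are sent to consumers?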
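
For context, the prefetch option mentioned here is exposed through Celery's worker settings. The following is a minimal configuration sketch, assuming the Celery 3.1 setting names (Celery 4 and later rename them to `worker_prefetch_multiplier` and `task_acks_late`). The file name and values are illustrative and do not come from the original post. Prefetch limits how many unacknowledged messages a worker holds at once, so it bounds concurrency rather than calls per second.

```python
# celeryconfig.py (hypothetical file name): Celery 3.1-style settings.

# Number of messages reserved ahead of time per worker process.
# 1 means "take one message, finish it, then fetch the next".
CELERYD_PREFETCH_MULTIPLIER = 1

# Acknowledge a message only after its task has finished, so a worker
# process does not reserve extra messages while it is still busy.
CELERY_ACKS_LATE = True
```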

Best answer

You will need to create your own rate limiter, because Celery's rate limit only applies per worker and "does not work as you expect it to".
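
To see why a per-worker limit does not give a global limit, here is a small arithmetic sketch. It assumes every worker always has tasks waiting and enforces the same limit independently, which is how Celery's `rate_limit` task option behaves. The function and parameter names are hypothetical.

```python
def calls_allowed(per_worker_rate, workers, seconds):
    """Count calls made when every worker enforces its own limit.

    per_worker_rate: calls per second each worker allows itself.
    workers: number of workers consuming the same queue.
    seconds: length of the simulated interval.
    """
    total = 0
    for _ in range(workers):
        # Each worker only sees its own counter, never the others'.
        total += per_worker_rate * seconds
    return total


# One worker honours a 1 call/second limit over 10 seconds...
print(calls_allowed(per_worker_rate=1, workers=1, seconds=10))  # 10
# ...but four workers with the same setting send four times as many.
print(calls_allowed(per_worker_rate=1, workers=4, seconds=10))  # 40
```

With an external limit of 1 CPS, adding workers therefore multiplies the outgoing rate unless the counter is shared between them.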

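I have personally found that it breaks completely when trying to add new tasks from within another task.

I think the range of requirements for rate limiting is too wide and depends on the application itself, so Celery's implementation is intentionally kept very simple.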

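Here is an example I created using Celery + Django + Redis. Basically, it adds a convenience method to your App.Task class that keeps track of your task execution rate in Redis. If the rate is too high, the task is retried at a later time.

The example sends an SMTP message, but that can easily be replaced with an API call.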

The algorithm is inspired by Figma: https://www.figma.com/blog/an-alternative-approach-to-rate-limiting/
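
The counting idea can be tried out without Redis. The following is a dependency-free sketch that stands in for the Redis hash and its expiry with plain dictionaries. The class name `InMemoryRateCounter` and the injectable `clock` are hypothetical and exist only for illustration. As in the listing that follows, a rejected attempt is still counted, and the whole counter resets `per` seconds after the first hit.

```python
import time


class InMemoryRateCounter:
    """Dictionary-based stand-in for the Redis hash used by the limiter."""

    def __init__(self, clock=time.time):
        self._clock = clock      # injectable so tests can control time
        self._hits = {}          # key -> {timestamp: count}
        self._expires_at = {}    # key -> absolute expiry time

    def is_rate_okay(self, key, times=30, per=60):
        now = self._clock()

        # Emulate key expiry: drop the whole hash once its TTL has passed.
        if key in self._expires_at and now >= self._expires_at[key]:
            del self._hits[key]
            del self._expires_at[key]

        # Record this attempt first, as HINCRBY would.
        hits = self._hits.setdefault(key, {})
        hits[now] = hits.get(now, 0) + 1

        # Set the expiry only when the key is new, as EXPIRE would.
        self._expires_at.setdefault(key, now + per)

        # Allowed while the total in the current window is within the limit.
        return sum(hits.values()) <= times


# Usage with a fake clock: a limit of 2 hits per 10 seconds.
fake_now = [100.0]
counter = InMemoryRateCounter(clock=lambda: fake_now[0])
print(counter.is_rate_okay("rate:demo", times=2, per=10))  # True
print(counter.is_rate_okay("rate:demo", times=2, per=10))  # True
print(counter.is_rate_okay("rate:demo", times=2, per=10))  # False
fake_now[0] += 10  # the window has expired, so counting starts again
print(counter.is_rate_okay("rate:demo", times=2, per=10))  # True
```

For the 1 CPS limit in the question, the equivalent call would use `times=1, per=1`.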

https://gist.github.com/Vigrond/2bbea9be6413415e5479998e79a1b11a

# Rate limiting with Celery + Django + Redis
# Multiple Fixed Windows Algorithm inspired by Figma https://www.figma.com/blog/an-alternative-approach-to-rate-limiting/
#   and Celery's sometimes ambiguous, vague, and one-paragraph documentation
#
# Celery's Task is subclassed and the is_rate_okay function is added


# celery.py or however your App is implemented in Django
import os
import math
import time

from celery import Celery, Task
from django_redis import get_redis_connection
from django.conf import settings
from django.utils import timezone


app = Celery('your_app')

# Get Redis connection from our Django 'default' cache setting
redis_conn = get_redis_connection("default")

# We subclass the Celery Task
class YourAppTask(Task):
  def is_rate_okay(self, times=30, per=60):
    """
      Checks to see if this task is hitting our defined rate limit too much.
      This example sets a rate limit of 30/minute.

      times (int): The "30" in "30 times per 60 seconds".
      per (int):  The "60" in "30 times per 60 seconds".

      The Redis structure we create is a Hash of timestamp keys with counter values
      {
        '1560649027.515933': '2',  // unlikely to have more than 1
        '1560649352.462433': '1',
      }

      The Redis key is expired after the amount of 'per' has elapsed.
      The algorithm totals the counters and checks against 'times'.

      This algorithm currently does not implement the "leniency" described 
      at the bottom of the figma article referenced at the top of this code.
      This is left up to you and depends on application.

      Returns True if under the limit, otherwise False.
    """

    # Get a timestamp accurate to the microsecond
    timestamp = timezone.now().timestamp()

    # Set our Redis key to our task name
    key = f"rate:{self.name}"

    # Create a pipeline to execute redis code atomically
    pipe = redis_conn.pipeline()

    # Increment our current task hit in the Redis hash
    pipe.hincrby(key, timestamp)

    # Grab the current expiration of our task key
    pipe.ttl(key)

    # Grab all of our task hits in the current window (of `per` seconds, 60 by default)
    pipe.hvals(key)

    # This returns a list of our command results.  [current task hits, expiration, list of all task hits,]
    result = pipe.execute()

    # If our expiration is not set, set it.  This is not part of the atomicity of the pipeline above.
    if result[1] < 0:
        redis_conn.expire(key, per)

    # Counter values come back as bytes; convert to int before summing and comparing to our limit
    return sum(int(count) for count in result[2]) <= times


app.Task = YourAppTask
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()

...

# SMTP Example
import random
from YourApp.celery import app
from django.core.mail import EmailMessage

# We set infinite max_retries so backlogged email tasks do not disappear
@app.task(name='smtp.send-email', max_retries=None, bind=True)
def send_email(self, to_address):

    if not self.is_rate_okay():
        # We implement a random countdown between 30 and 60 seconds 
        #   so tasks don't come flooding back at the same time
        raise self.retry(countdown=random.randint(30, 60))

    message = EmailMessage(
        'Hello',
        'Body goes here',
        'from@yourdomain.com',
        [to_address],
    )
    message.send()
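
The random countdown in `send_email` is there so that rejected tasks do not all return in the same instant. The following is a small standalone sketch of that effect. The helper `retry_times` and the seeded generator are hypothetical and only serve to make the illustration repeatable.

```python
import random
from collections import Counter


def retry_times(rejected, now, low, high, rng):
    """Return the second at which each rejected task comes back.

    rejected: number of tasks turned away at time `now`.
    low, high: inclusive countdown bounds in seconds.
    rng: a random.Random instance, passed in so runs are repeatable.
    """
    return [now + rng.randint(low, high) for _ in range(rejected)]


rng = random.Random(0)  # fixed seed, for a repeatable illustration only

# Fixed countdown: all 100 tasks return in the same second.
fixed = Counter(retry_times(100, now=0, low=30, high=30, rng=rng))
print(max(fixed.values()))  # 100

# Random countdown between 30 and 60 seconds: arrivals are spread out.
jittered = Counter(retry_times(100, now=0, low=30, high=60, rng=rng))
print(len(jittered) > 1, max(jittered.values()) < 100)
```

For a 1 CPS limit, the bounds would need to be chosen relative to the backlog size, since each retry that is rejected again also counts as a hit in `is_rate_okay`.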

Regarding python - Rate limiting an external API with RabbitMQ and Celery, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/39312700/
