callback - Airflow HTTP callback sensor


Reposted by: 行者123 · Updated: 2023-12-03 22:55:29


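Our Airflow implementation sends HTTP requests to services to have them perform tasks. We want those services to let Airflow know when they have finished, so we send each service a callback URL that it will call once its task is done. However, I can't seem to find a callback sensor. How do people usually handle this situation?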

Best answer

There is no such thing as a callback or webhook sensor in Airflow. Sensors are defined as follows, taken from the documentation:

Sensors are a certain type of operator that will keep running until a certain criterion is met. Examples include a specific file landing in HDFS or S3, a partition appearing in Hive, or a specific time of the day. Sensors are derived from BaseSensorOperator and run a poke method at a specified poke_interval until it returns True.

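This means that a sensor is an operator that polls an external system. In that sense, your external service needs some way of keeping state for each task it executes, either internally or externally, so that a polling sensor can check that state.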

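That way you can use, for example, airflow.operators.HttpSensor to poll an HTTP endpoint until a condition is met. Or, better still, write your own custom sensor, which gives you the chance to do more complex processing and keep state.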
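As a minimal sketch of such a polling condition, the check below decides whether a status response means "finished". The JSON shape, the `state` field and its values are assumptions about the external service, not something Airflow defines, and `FakeResponse` exists only so the function can be exercised locally without a network call.

```python
import json


class FakeResponse:
    """Stand-in for a requests.Response, used only to try the check locally."""

    def __init__(self, payload):
        self.text = json.dumps(payload)

    def json(self):
        return json.loads(self.text)


def job_finished(response):
    """Return True when the (hypothetical) status endpoint reports completion.

    Assumes the service answers with JSON such as {"state": "done"}.
    """
    state = response.json().get("state")
    if state == "failed":
        # Fail fast instead of polling until the sensor times out.
        raise RuntimeError("external job failed")
    return state == "done"


print(job_finished(FakeResponse({"state": "running"})))  # False
print(job_finished(FakeResponse({"state": "done"})))     # True
```

A function like this can be passed to HttpSensor through its `response_check` parameter, alongside `http_conn_id`, `endpoint` and `poke_interval`; the connection id and endpoint depend on your own setup.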

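Otherwise, if the service writes its output to a storage system, you could use, for example, a sensor that polls a database. I'm sure you get the idea.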
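The database variant can be sketched the same way. The example below uses an in-memory SQLite database purely for illustration; the `job_status` table, its columns and the `job_is_done` helper are hypothetical names, and a real service would write to whatever store it already uses.

```python
import sqlite3


def job_is_done(conn, job_id):
    """Poke-style check: True once the job's row says 'done'.

    The table name `job_status` and its columns are hypothetical.
    """
    row = conn.execute(
        "SELECT status FROM job_status WHERE job_id = ?", (job_id,)
    ).fetchone()
    return row is not None and row[0] == "done"


# Local demonstration with an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_status (job_id TEXT PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO job_status VALUES ('job-1', 'running')")
print(job_is_done(conn, "job-1"))  # False: the job is still running

# The external service records completion; the next poll sees it.
conn.execute("UPDATE job_status SET status = 'done' WHERE job_id = 'job-1'")
print(job_is_done(conn, "job-1"))  # True
```

Inside Airflow, the body of `job_is_done` would become the `poke` method of a custom sensor, or the query could be handed to the built-in SqlSensor.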

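I attach an example of a custom operator that I wrote to integrate with the Apache Livy API. It does two things: a) submits a Spark job through the REST API, and b) waits for the job to complete.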

The operator extends SimpleHttpOperator and also implements an HttpSensor-style poke method, thus combining both functionalities.

import json
from datetime import datetime
from time import sleep

# Import paths below are the Airflow 1.10 ones; adjust them for other versions.
from airflow.exceptions import AirflowException, AirflowSensorTimeout
from airflow.hooks.http_hook import HttpHook
from airflow.operators.http_operator import SimpleHttpOperator
from airflow.utils.decorators import apply_defaults


class LivyBatchOperator(SimpleHttpOperator):
    """
    Submits a new Spark batch job through
    the Apache Livy REST API.

    """

    template_fields = ('args',)
    ui_color = '#f4a460'

    @apply_defaults
    def __init__(self,
                 name,
                 className,
                 file,
                 executorMemory='1g',
                 driverMemory='512m',
                 driverCores=1,
                 executorCores=1,
                 numExecutors=1,
                 args=None,
                 conf=None,
                 timeout=120,
                 http_conn_id='apache_livy',
                 *arguments, **kwargs):
        """
        Stores the Livy batch parameters. `timeout` is the maximum
        number of seconds to wait for the batch to finish.
        """
        super(LivyBatchOperator, self).__init__(
            endpoint='batches', *arguments, **kwargs)

        self.http_conn_id = http_conn_id
        self.method = 'POST'
        self.endpoint = 'batches'
        self.name = name
        self.className = className
        self.file = file
        self.executorMemory = executorMemory
        self.driverMemory = driverMemory
        self.driverCores = driverCores
        self.executorCores = executorCores
        self.numExecutors = numExecutors
        # avoid sharing mutable default arguments between instances
        self.args = args or []
        self.conf = conf or {}
        self.timeout = timeout
        self.poke_interval = 10

    def execute(self, context):
        """
        Submits the batch, then polls its state until it finishes,
        fails, or the timeout expires.
        """

        payload = {
            "name": self.name,
            "className": self.className,
            "executorMemory": self.executorMemory,
            "driverMemory": self.driverMemory,
            "driverCores": self.driverCores,
            "executorCores": self.executorCores,
            "numExecutors": self.numExecutors,
            "file": self.file,
            "args": self.args,
            "conf": self.conf
        }
        self.log.info("Payload: %s", payload)
        headers = {
            'X-Requested-By': 'airflow',
            'Content-Type': 'application/json'
        }

        http = HttpHook(self.method, http_conn_id=self.http_conn_id)

        self.log.info("Submitting batch through Apache Livy API")

        response = http.run(self.endpoint,
                            json.dumps(payload),
                            headers,
                            self.extra_options)

        # parse the JSON response
        obj = response.json()

        # get the new batch Id
        self.batch_id = obj['id']

        self.log.info('Batch successfully submitted with Id %s', self.batch_id)

        # start polling the batch status
        started_at = datetime.utcnow()
        while not self.poke(context):
            if (datetime.utcnow() - started_at).total_seconds() > self.timeout:
                raise AirflowSensorTimeout('Snap. Time is OUT.')

            sleep(self.poke_interval)

        self.log.info("Batch %s has finished", self.batch_id)

    def poke(self, context):
        '''
        Returns True once the batch has succeeded and False while it is
        still starting or running; raises on any other state.
        '''

        http = HttpHook(method='GET', http_conn_id=self.http_conn_id)

        self.log.info("Calling Apache Livy API to get batch status")

        # call the API endpoint
        endpoint = 'batches/' + str(self.batch_id)
        response = http.run(endpoint)

        # parse the JSON response
        obj = response.json()

        # get the current state of the batch
        state = obj['state']

        # check the batch state
        if state in ('starting', 'running'):
            # if state is 'starting' or 'running'
            # signal a new polling cycle
            self.log.info('Batch %s has not finished yet (%s)',
                          self.batch_id, state)
            return False
        elif state == 'success':
            # if state is 'success' exit
            return True
        else:
            # for all other states
            # raise an exception and
            # terminate the task
            raise AirflowException(
                'Batch ' + str(self.batch_id) + ' failed (' + state + ')')
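
The poll-until-done loop in `execute` can be isolated into a small helper, which makes the timeout behaviour easy to try out without Airflow or Livy. This is only a sketch and not part of the original operator: `wait_until` and its `clock` and `sleep` parameters are hypothetical names introduced here, and the state sequence is made up for the demonstration.

```python
import time


def wait_until(poke, timeout, poke_interval,
               clock=time.monotonic, sleep=time.sleep):
    """Call poke() until it returns True or `timeout` seconds have elapsed."""
    started_at = clock()
    while not poke():
        # Same ordering as the operator: check the deadline after a failed poke.
        if clock() - started_at > timeout:
            raise TimeoutError("condition not met within %s seconds" % timeout)
        sleep(poke_interval)


# Demonstration with a fake state sequence and no real waiting.
states = iter(["starting", "running", "success"])
polls = []


def fake_poke():
    state = next(states)
    polls.append(state)
    return state == "success"


wait_until(fake_poke, timeout=5, poke_interval=0)
print(polls)  # ['starting', 'running', 'success']
```

Injecting `clock` and `sleep` keeps the helper deterministic in tests; in the operator itself the real clock and `sleep(self.poke_interval)` play those roles.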

Hope this helps.

Regarding callback - Airflow HTTP callback sensor, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51566029/
