gpt4 book ai didi

python - Airflow 信号 SIGTERM 意外地子进程

转载 作者:太空狗 更新时间:2023-10-30 00:18:24 26 4
gpt4 key购买 nike

我正在使用 PythonOperator 来调用一个将数据工程过程并行化为 Airflow 任务的函数。这只需用 Airflow 调用的可调用包装函数包装一个简单函数即可完成。

def wrapper(ds, **kwargs):
process_data()

process_data 使用生成子进程的多处理模块实现并行化。当我从 jupyter notebook 单独运行 process_data 时,它运行到最后没有问题。但是,当我使用 Airflow 运行它时,任务失败并且任务日志显示类似这样的内容。

[2019-01-22 17:16:46,966] {models.py:1610} ERROR - Received SIGTERM. Terminating subprocesses.
[2019-01-22 17:16:46,969] {logging_mixin.py:95} WARNING - Process ForkPoolWorker-129:

[2019-01-22 17:16:46,973] {logging_mixin.py:95} WARNING - Traceback (most recent call last):

[2019-01-22 17:16:46,973] {logging_mixin.py:95} WARNING - File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
self.run()

[2019-01-22 17:16:46,973] {logging_mixin.py:95} WARNING - File "/home/airflow/.env/lib/python3.5/site-packages/airflow/models.py", line 1612, in signal_handler
raise AirflowException("Task received SIGTERM signal")

[2019-01-22 17:16:46,973] {logging_mixin.py:95} WARNING - airflow.exceptions.AirflowException: Task received SIGTERM signal

[2019-01-22 17:16:46,993] {models.py:1610} ERROR - Received SIGTERM. Terminating subprocesses.
[2019-01-22 17:16:46,996] {logging_mixin.py:95} WARNING - Process ForkPoolWorker-133:

[2019-01-22 17:16:46,999] {logging_mixin.py:95} WARNING - Traceback (most recent call last):

[2019-01-22 17:16:46,999] {logging_mixin.py:95} WARNING - File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
self.run()

[2019-01-22 17:16:46,999] {logging_mixin.py:95} WARNING - File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)

[2019-01-22 17:16:46,999] {logging_mixin.py:95} WARNING - File "/usr/lib/python3.5/multiprocessing/pool.py", line 108, in worker
task = get()

[2019-01-22 17:16:46,999] {logging_mixin.py:95} WARNING - File "/usr/lib/python3.5/multiprocessing/queues.py", line 343, in get
res = self._reader.recv_bytes()

[2019-01-22 17:16:46,999] {logging_mixin.py:95} WARNING - File "/usr/lib/python3.5/multiprocessing/synchronize.py", line 99, in __exit__
return self._semlock.__exit__(*args)

[2019-01-22 17:16:46,999] {logging_mixin.py:95} WARNING - File "/home/airflow/.env/lib/python3.5/site-packages/airflow/models.py", line 1612, in signal_handler
raise AirflowException("Task received SIGTERM signal")

[2019-01-22 17:16:46,999] {logging_mixin.py:95} WARNING - airflow.exceptions.AirflowException: Task received SIGTERM signal

[2019-01-22 17:16:47,086] {logging_mixin.py:95} INFO - file parsing and processing 256.07

[2019-01-22 17:17:12,938] {logging_mixin.py:95} INFO - combining and sorting 25.85

我不太确定任务收到 SIGTERM 的原因。我的猜测是一些更高级别的进程正在将它们发送到子进程。我应该如何调试这个问题?

刚刚注意到在任务日志的末尾,它清楚地指出

airflow.exceptions.AirflowException: Task received SIGTERM signal
[2019-01-22 12:31:39,196] {models.py:1764} INFO - Marking task as FAILED.

最佳答案

我遵循了答案 here .想法是一样的:不要让 Airflow 过早关闭线程:

export AIRFLOW__CORE__KILLED_TASK_CLEANUP_TIME=604800 成功了。

关于python - Airflow 信号 SIGTERM 意外地子进程,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/54313601/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com