
logging - Airflow on Kubernetes cannot fetch task logs

Reposted · Author: 行者123 · Updated: 2023-12-02 12:27:50

My Airflow service runs as a Kubernetes Deployment with two containers: one for the webserver and one for the scheduler.
I run my tasks with KubernetesPodOperator using the in_cluster=True parameter, and that works fine: I can even run kubectl logs pod-name and see all of the logs.
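As a sketch of the setup described above, a task launched via KubernetesPodOperator with in_cluster=True might look like the following DAG fragment. The DAG id, task id, schedule, and pod name are hypothetical placeholders; only the image and namespace are taken from the Deployment shown below (import path is for Airflow 1.10):

```python
from datetime import datetime

from airflow import DAG
# Airflow 1.10 import path; in Airflow 2.x this operator lives in the
# cncf.kubernetes provider package instead.
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

dag = DAG(
    dag_id="dag_name",                 # placeholder, as in the log paths above
    start_date=datetime(2020, 5, 1),
    schedule_interval=None,
)

task = KubernetesPodOperator(
    task_id="task_name",               # placeholder
    name="task-pod",                   # hypothetical pod name
    namespace="airflow",               # namespace from the Deployment below
    image="registry.personal.io:5000/image/path",
    in_cluster=True,   # authenticate to the K8s API via the pod's service account
    get_logs=True,     # stream the pod's stdout into the Airflow task log
    dag=dag,
)
```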

However, the airflow-webserver cannot fetch the logs:

*** Log file does not exist: /tmp/logs/dag_name/task_name/2020-05-19T23:17:33.455051+00:00/1.log
*** Fetching from: http://pod-name-7dffbdf877-6mhrn:8793/log/dag_name/task_name/2020-05-19T23:17:33.455051+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='pod-name-7dffbdf877-6mhrn', port=8793): Max retries exceeded with url: /log/dag_name/task_name/2020-05-19T23:17:33.455051+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fef6e00df10>: Failed to establish a new connection: [Errno 111] Connection refused'))

It seems the pod cannot connect to the Airflow log server on port 8793. If I kubectl exec into the container with bash, I can curl localhost on port 8080, but not on ports 80 or 8793.

The Kubernetes Deployment:
# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pod-name
  namespace: airflow
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pod-name
  template:
    metadata:
      labels:
        app: pod-name
    spec:
      restartPolicy: Always
      volumes:
        - name: airflow-cfg
          configMap:
            name: airflow.cfg
        - name: dags
          emptyDir: {}
      containers:
        - name: airflow-scheduler
          args:
            - airflow
            - scheduler
          image: registry.personal.io:5000/image/path
          imagePullPolicy: Always
          volumeMounts:
            - name: dags
              mountPath: /airflow_dags
            - name: airflow-cfg
              mountPath: /home/airflow/airflow.cfg
              subPath: airflow.cfg
          env:
            - name: EXECUTOR
              value: Local
            - name: LOAD_EX
              value: "n"
            - name: FORWARDED_ALLOW_IPS
              value: "*"
          ports:
            - containerPort: 8793
            - containerPort: 8080
        - name: airflow-webserver
          args:
            - airflow
            - webserver
            - --pid
            - /tmp/airflow-webserver.pid
          image: registry.personal.io:5000/image/path
          imagePullPolicy: Always
          volumeMounts:
            - name: dags
              mountPath: /airflow_dags
            - name: airflow-cfg
              mountPath: /home/airflow/airflow.cfg
              subPath: airflow.cfg
          ports:
            - containerPort: 8793
            - containerPort: 8080
          env:
            - name: EXECUTOR
              value: Local
            - name: LOAD_EX
              value: "n"
            - name: FORWARDED_ALLOW_IPS
              value: "*"

Note: everything works fine when Airflow runs in the development environment (locally, not on Kubernetes).

Best answer

When a task completes, Airflow deletes the pod. Could it be that the pod is simply gone by the time the webserver tries to fetch its logs, which is why they are unreachable?

To check whether that is the case, try setting AIRFLOW__KUBERNETES__DELETE_WORKER_PODS=False.
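As a sketch, this flag can be added to the scheduler container's env list in the Deployment above. The AIRFLOW__KUBERNETES__DELETE_WORKER_PODS name follows Airflow's AIRFLOW__{SECTION}__{KEY} environment-variable convention, overriding the [kubernetes] delete_worker_pods option:

```yaml
# Extra entry for the scheduler container's env list, so finished
# worker pods are kept around for debugging instead of being deleted.
env:
  - name: AIRFLOW__KUBERNETES__DELETE_WORKER_PODS
    value: "False"
```

Keeping the pods lets you confirm whether log fetching works while a pod still exists; it should be turned back on afterwards to avoid accumulating completed pods.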
When running Airflow on Kubernetes, I recommend using remote logging (e.g. to S3), so that logs are preserved even when pods are deleted.
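A minimal sketch of S3 remote logging in airflow.cfg, assuming Airflow 1.10 (where these options live under [core]; in Airflow 2.x they moved to a [logging] section). The bucket name and connection id are hypothetical:

```
[core]
# Write task logs to remote storage in addition to the local file.
remote_logging = True
# Hypothetical S3 bucket for log storage.
remote_base_log_folder = s3://my-airflow-logs
# Hypothetical Airflow connection id holding the AWS credentials.
remote_log_conn_id = my_aws_conn
encrypt_s3_logs = False
```

With this in place, the webserver reads finished-task logs from S3 instead of contacting the (possibly deleted) worker pod on port 8793.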

Regarding "logging - Airflow on Kubernetes cannot fetch logs", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/61902573/
