
jupyter-notebook - How do I pass parameters to PapermillOperator to run a job on Airflow?

Reposted. Author: 行者123. Updated: 2023-12-04 16:03:38

When I run an Airflow job with PapermillOperator, the DAG run fails.

I am having trouble passing parameters to PapermillOperator.

I opened papermill_operator.py (packages/airflow/operators/papermill_operator.py) and hard-coded one line to specify the parameters:

def execute(self, context):
    for i in range(len(self.inlets)):
        pm.execute_notebook(self.inlets[i].location,
                            self.outlets[i].location,
                            parameters=dict(msgs="hello"),  # hard-coded dict
                            progress_bar=False, report_mode=True)

With that change, it works.

The original code is:
def execute(self, context):
    for i in range(len(self.inlets)):
        pm.execute_notebook(self.inlets[i].location,
                            self.outlets[i].location,
                            parameters=self.inlets[i].parameters,
                            progress_bar=False, report_mode=True)

I also tried another solution:
https://github.com/nteract/papermill/issues/324#issuecomment-472446375
It works fine.

My DAG code is:
import airflow

from airflow.models import DAG
from airflow.operators.papermill_operator import PapermillOperator

from datetime import timedelta

args = {
    'owner': 'Airflow',
    'start_date': airflow.utils.dates.days_ago(2),
}

dag = DAG(
    dag_id='9', default_args=args,
    schedule_interval='@once',
    dagrun_timeout=timedelta(minutes=10))

run_this = PapermillOperator(
    task_id="1",
    dag=dag,
    input_nb="/home/exa00112/abc.ipynb",
    output_nb="/home/exa00112/umesh.ipynb",
    parameters=dict(msgs="hello")  # parameters to inject into the notebook
)

run_this

[2019-09-10 20:36:48,806] {logging_mixin.py:95} INFO - [2019-09-10 20:36:48,806] {datasets.py:62} INFO - parameters
[2019-09-10 20:36:48,806] {__init__.py:1580} ERROR - Can't compile non template nodes
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/airflow/models/__init__.py", line 1441, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python3.5/dist-packages/airflow/operators/papermill_operator.py", line 63, in execute
    parameters=self.inlets[i].parameters,
  File "/usr/local/lib/python3.5/dist-packages/airflow/lineage/datasets.py", line 66, in __getattr__
    return env.from_string(self._data.get(attr)).render(**self.context)
  File "/home/exa00112/.local/lib/python3.5/site-packages/jinja2/environment.py", line 880, in from_string
    return cls.from_code(self, self.compile(source), globals, None)
  File "/home/exa00112/.local/lib/python3.5/site-packages/jinja2/environment.py", line 581, in compile
    defer_init=defer_init)
  File "/home/exa00112/.local/lib/python3.5/site-packages/jinja2/environment.py", line 543, in _generate
    optimized=self.optimized)
  File "/home/exa00112/.local/lib/python3.5/site-packages/jinja2/compiler.py", line 78, in generate
    raise TypeError('Can\'t compile non template nodes')
TypeError: Can't compile non template nodes
[2019-09-10 20:36:48,808] {__init__.py:1611} INFO - Marking task as FAILED.
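The traceback shows the lineage dataset handing the `parameters` attribute to Jinja2's `from_string`, which expects a template string. The following is a minimal sketch of that failure outside Airflow, assuming the attribute holds the dict from the DAG; `compile_error` is a hypothetical helper name used only for illustration, and the demo is skipped when the third-party `jinja2` package is not installed.

```python
# Minimal reproduction sketch: Jinja2 compiles template *strings*, not dicts.
try:
    from jinja2 import Environment
except ImportError:  # jinja2 is third-party; skip the demo if it is absent
    Environment = None


def compile_error(source):
    """Return the TypeError message Jinja2 raises for `source`, or None.

    None is also returned when jinja2 is unavailable (hypothetical helper).
    """
    if Environment is None:
        return None
    try:
        Environment().from_string(source)
    except TypeError as exc:
        return str(exc)
    return None


# A plain string is a valid template source, so no error is reported.
string_result = compile_error("{{ msgs }}")
# A dict, like the `parameters` value in the DAG, is not a template source.
dict_result = compile_error({"msgs": "hello"})
print(string_result, dict_result)
```

This would also explain why hard-coding the dict inside `execute` works: the value then never passes through the templating step.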

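Best answer

This appears to be a problem with PapermillOperator passing an invalid parameter data structure to Papermill (it passes the parameter dict as a string, while Papermill expects a dict). It is tracked in:

https://issues.apache.org/jira/browse/AIRFLOW-5774

I am not sure when it will be fixed, because it looks like a duplicate issue and is hard to track.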
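Until the operator is fixed, one possible workaround is to skip PapermillOperator and call `pm.execute_notebook` from a plain Python callable, so the parameters stay a real dict. This is a sketch under assumptions, not something taken from the answer or the linked issue: `run_notebook` and its `execute` argument are hypothetical names, and Papermill is imported lazily so the function can be exercised with a stand-in.

```python
from typing import Any, Callable, Dict, Optional


def run_notebook(input_nb: str, output_nb: str, parameters: Dict[str, Any],
                 execute: Optional[Callable[..., Any]] = None) -> Any:
    """Run a notebook with Papermill, passing `parameters` as a dict.

    `execute` is a hypothetical hook for substituting a stand-in; by default
    the real `papermill.execute_notebook` is used.
    """
    if not isinstance(parameters, dict):
        # Fail early instead of letting a string reach Papermill.
        raise TypeError(
            "parameters must be a dict, got %s" % type(parameters).__name__)
    if execute is None:
        import papermill as pm  # imported lazily; third-party dependency
        execute = pm.execute_notebook
    # Same keyword arguments the operator passes in the question.
    return execute(input_nb, output_nb, parameters=parameters,
                   progress_bar=False, report_mode=True)
```

In an Airflow 1.10-style DAG, a callable like this could be handed to `PythonOperator` through `python_callable`, with the notebook paths and the dict supplied in `op_kwargs`.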

Regarding "jupyter-notebook - How do I pass parameters to PapermillOperator to run a job on Airflow?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57882223/
