
amazon-web-services - Apache Airflow S3ListOperator not listing files

Reposted · Author: 行者123 · Updated: 2023-12-05 01:07:34

I am trying to list the files in an S3 bucket in my AWS account using the S3ListOperator from airflow.providers.amazon.aws.operators.s3_list, with the following DAG task:

list_bucket = S3ListOperator(
    task_id='list_files_in_bucket',
    bucket='<MY_BUCKET>',
    aws_conn_id='s3_default'
)

I have configured my connection's Extra field as: {"aws_access_key_id": "<MY_ACCESS_KEY>", "aws_secret_access_key": "<MY_SECRET_KEY>"}
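As an aside, the same credentials can be supplied without the UI by exporting the connection as an environment variable, a sketch assuming Airflow's AIRFLOW_CONN_<CONN_ID> URI convention (placeholders as in the post; a secret containing special characters must be URL-encoded):

```shell
# Airflow resolves connections from env vars named AIRFLOW_CONN_<CONN_ID>.
# For an AWS connection, the URI login is the access key id and the
# password is the secret access key.
export AIRFLOW_CONN_S3_DEFAULT='aws://<MY_ACCESS_KEY>:<MY_SECRET_KEY>@'
```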

When I run my Airflow job, it appears to execute fine and the task status is success. Here is the log output:

[2021-04-27 11:44:50,009] {base_aws.py:368} INFO - Airflow Connection: aws_conn_id=s3_default
[2021-04-27 11:44:50,013] {base_aws.py:170} INFO - Credentials retrieved from extra_config
[2021-04-27 11:44:50,013] {base_aws.py:84} INFO - Creating session with aws_access_key_id=<MY_ACCESS_KEY> region_name=None
[2021-04-27 11:44:50,027] {base_aws.py:157} INFO - role_arn is None
[2021-04-27 11:44:50,661] {taskinstance.py:1185} INFO - Marking task as SUCCESS. dag_id=two_step, task_id=list_files_in_bucket, execution_date=20210427T184422, start_date=20210427T184439, end_date=20210427T184450
[2021-04-27 11:44:50,676] {taskinstance.py:1246} INFO - 0 downstream tasks scheduled from follow-on schedule check
[2021-04-27 11:44:50,700] {local_task_job.py:146} INFO - Task exited with return code 0

What can I do to print the files in the bucket to the log? TIA

Best Answer

This code is sufficient; you don't need any print calls. The operator returns the list of keys, so just open the task's XCom tab (or the task log) and the returned list is there.

list_bucket = S3ListOperator(
    task_id='list_files_in_bucket',
    bucket='ob-air-pre',
    prefix='data/',
    delimiter='/',
    aws_conn_id='aws'
)


Regarding "amazon-web-services - Apache Airflow S3ListOperator not listing files", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/67289076/
