
python - Unable to trigger a Composer/Airflow DAG from a Cloud Function that fires on Cloud Storage changes

Reposted. Author: 行者123. Updated: 2023-12-04 14:10:30

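I have created and run DAGs on a google-cloud-composer environment (dlkpipelinesv1: composer-1.13.0-airflow-1.10.12). I can trigger these DAGs manually and with the scheduler, but I am stuck when triggering them through cloud-functions that detect changes in a google-cloud-storage bucket.
Note that I have another GC-Composer environment (pipelines: composer-1.7.5-airflow-1.10.2) that uses the same Google Cloud Function to trigger the relevant DAGs, and it is working.
I followed this guide to create the function that triggers the DAGs, and retrieved the following variables: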

import logging

import requests
from google.auth.transport.requests import Request
from google.oauth2 import id_token

PROJECT_ID = "<project_id>"
CLIENT_ID = "<client_id_retrieved_by_running_the_code_in_the_guide_within_my_gcp_console>"
WEBSERVER_ID = "<airflow_webserver_id>"
DAG_NAME = "<dag_to_trigger>"
WEBSERVER_URL = f"https://{WEBSERVER_ID}.appspot.com/api/experimental/dags/{DAG_NAME}/dag_runs"


def file_listener(event, context):
    """Entry point of the cloud function: Triggered by a change to a Cloud Storage bucket.
    Args:
        event (dict): Event payload.
        context (google.cloud.functions.Context): Metadata for the event.
    """
    logging.info("Running the file listener process")
    logging.info(f"event : {event}")
    logging.info(f"context : {context}")
    file = event
    if file["size"] == "0" or "DTM_DATALAKE_AUDIT_COMPTAGE" not in file["name"] or ".filepart" in file["name"].lower():
        logging.info("no matching file")
        # Return rather than exit(0): a Cloud Function should end by returning, not by terminating the process.
        return

    logging.info(f"File listener detected the presence of : {file['name']}.")

    # id_token = authorize_iap()
    # make_iap_request({"file_name": file["name"]}, id_token)
    make_iap_request(url=WEBSERVER_URL, client_id=CLIENT_ID, method="POST")


def make_iap_request(url, client_id, method="GET", **kwargs):
    """Makes a request to an application protected by Identity-Aware Proxy.

    Args:
        url: The Identity-Aware Proxy-protected URL to fetch.
        client_id: The client ID used by Identity-Aware Proxy.
        method: The request method to use
            ('GET', 'OPTIONS', 'HEAD', 'POST', 'PUT', 'PATCH', 'DELETE')
        **kwargs: Any of the parameters defined for the request function:
            https://github.com/requests/requests/blob/master/requests/api.py
            If no timeout is provided, it is set to 90 by default.

    Returns:
        The parsed JSON body, or raises an exception if the page couldn't be retrieved.
    """
    # Set the default timeout, if missing
    if "timeout" not in kwargs:
        kwargs["timeout"] = 90

    # Obtain an OpenID Connect (OIDC) token from metadata server or using service account.
    open_id_connect_token = id_token.fetch_id_token(Request(), client_id)
    logging.info(f"Retrieved open id connect (bearer) token {open_id_connect_token}")

    # Fetch the Identity-Aware Proxy-protected URL, including an authorization header containing "Bearer " followed by a
    # Google-issued OpenID Connect token for the service account.
    resp = requests.request(method, url, headers={"Authorization": f"Bearer {open_id_connect_token}"}, **kwargs)

    if resp.status_code == 403:
        raise Exception("Service account does not have permission to access the IAP-protected application.")
    elif resp.status_code != 200:
        raise Exception(f"Bad response from application: {resp.status_code} / {resp.headers} / {resp.text}")
    else:
        logging.info(f"Response status - {resp.status_code}")
        # resp.json is a method; call it so the parsed body is returned rather than the bound method.
        return resp.json()
This is the code that runs in the Cloud Function.
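The filter at the top of file_listener can be easier to reason about (and to unit test) when pulled out into a pure function. The sketch below is only an illustration: should_trigger and build_trigger_payload are hypothetical names that do not appear in the question, and the {"conf": ...} body is an assumption about how the file name from the commented-out call might be passed to the DAG run.

```python
def should_trigger(event: dict) -> bool:
    """Return True when the uploaded object should start the DAG.

    Mirrors the condition used in file_listener.
    """
    name = event.get("name", "")
    # Cloud Storage reports the object size as a string, hence the comparison with "0".
    if event.get("size") == "0":
        return False
    # Skip partially uploaded files.
    if ".filepart" in name.lower():
        return False
    return "DTM_DATALAKE_AUDIT_COMPTAGE" in name


def build_trigger_payload(event: dict) -> dict:
    """Hypothetical helper: wrap the object name in a `conf` body for the DAG run."""
    return {"conf": {"file_name": event["name"]}}
```

With helpers like these, file_listener would reduce to "return early unless should_trigger(event)", and the payload could be passed to make_iap_request through its **kwargs (for example json=build_trigger_payload(event)), assuming the DAG reads the file name from its run configuration.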
I checked the environment details of dlkpipelinesv1 and pipelines respectively using this code:
import google.auth
import google.auth.transport.requests

# Authenticate with the default credentials and open an authorized session.
credentials, _ = google.auth.default(
    scopes=['https://www.googleapis.com/auth/cloud-platform'])
authed_session = google.auth.transport.requests.AuthorizedSession(
    credentials)

# project_id = 'YOUR_PROJECT_ID'
# location = 'us-central1'
# composer_environment = 'YOUR_COMPOSER_ENVIRONMENT_NAME'

# Fetch the environment description from the Composer API.
environment_url = (
    'https://composer.googleapis.com/v1beta1/projects/{}/locations/{}'
    '/environments/{}').format(project_id, location, composer_environment)
composer_response = authed_session.request('GET', environment_url)
environment_data = composer_response.json()
Both run with the same service account, i.e. the same IAM roles. However, I noticed the following differences:
In the old environment:
"airflowUri": "https://p5<hidden_value>-tp.appspot.com",
"privateEnvironmentConfig": { "privateClusterConfig": {} },
In the new environment:
"airflowUri": "https://da<hidden_value>-tp.appspot.com",
"privateEnvironmentConfig": {
"privateClusterConfig": {},
"webServerIpv4CidrBlock": "<hidden_value>",
"cloudSqlIpv4CidrBlock": "<hidden_value>"
}
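Spotting such differences by eye is error-prone, so a small helper can list the keys that exist in one environment description but not in the other. This is a minimal sketch: flatten and keys_only_in_new are hypothetical helper names, and the sample dictionaries simply repeat the fragments quoted above with the hidden values left as placeholders.

```python
def flatten(d, prefix=""):
    """Flatten nested dicts into {'a.b.c': value} form."""
    flat = {}
    for key, value in d.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and value:
            # Recurse into non-empty dicts; empty dicts are kept as leaves.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def keys_only_in_new(old, new):
    """Return the sorted dotted paths present in `new` but absent from `old`."""
    return sorted(set(flatten(new)) - set(flatten(old)))


old_env = {"privateEnvironmentConfig": {"privateClusterConfig": {}}}
new_env = {
    "privateEnvironmentConfig": {
        "privateClusterConfig": {},
        "webServerIpv4CidrBlock": "<hidden_value>",
        "cloudSqlIpv4CidrBlock": "<hidden_value>",
    }
}

print(keys_only_in_new(old_env, new_env))
# ['privateEnvironmentConfig.cloudSqlIpv4CidrBlock', 'privateEnvironmentConfig.webServerIpv4CidrBlock']
```

The same function could be applied to the two full environment_data dictionaries returned by the Composer API call.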
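The service account I use to make the POST request has the following roles:
Cloud Functions Service Agent
Composer Administrator
Composer User
Service Account Token Creator
Service Account User
The service account that runs my Composer environment has the following roles:
BigQuery Admin
Composer Worker
Service Account Token Creator
Storage Object Admin
But I still get a 403 - Forbidden in the Log Explorer when the POST request is made to the Airflow API.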
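Edit 2020-11-16:
I have updated to the latest version of the make_iap_request code.
I tinkered with IAP in the security service, but I cannot find the web server that would accept HTTP POST requests from my Cloud Function (see the image below). In any case, I added the service account to the default and CRM IAP resources just in case, but I still get this error:
Exception: Service account does not have permission to access the IAP-protected application.
The main question is: which IAP is at stake here, and how do I add my service account as a user of this IAP?
What am I missing?
[Image: List of HTTP IAP]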
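When a 403 like this shows up, one thing worth ruling out is the token itself. The helper below is a debugging sketch, not part of the original code: decode_jwt_claims is a hypothetical name, and it decodes the token payload without verifying the signature, so it should only be used for inspection. The aud claim is expected to match the IAP client ID passed to fetch_id_token; an email claim, if present, shows which service account the token was issued for.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Decode the payload of a JWT WITHOUT verifying its signature (debugging only)."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url-encoded with the padding stripped; restore it.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))
```

Inside make_iap_request this could be used as claims = decode_jwt_claims(open_id_connect_token), followed by a comparison of claims["aud"] with client_id. If the audience matches and the request is still rejected, the refusal may be coming from something other than the token.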

Best answer

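There is a configuration parameter that causes every request to the API to be rejected...
The documentation mentions that we need to override the following Airflow configuration: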

[api]
auth_backend = airflow.api.auth.backend.deny_all
to
[api]
auth_backend = airflow.api.auth.backend.default
This detail is really important, and it is not mentioned in Google's documentation...
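This would also explain why the older environment kept working: Airflow versions before 1.10.11 allowed API requests by default, while later versions default to deny_all, and a 403 returned by Airflow itself is reported by make_iap_request as an IAP permission problem. To check whether an environment already carries the override, the environment_data dictionary fetched in the question can be inspected. The sketch below assumes the overrides are exposed under config.softwareConfig.airflowConfigOverrides with "<section>-<key>" names; api_auth_backend_override is a hypothetical helper name.

```python
def api_auth_backend_override(environment_data: dict):
    """Return the `[api] auth_backend` override of a Composer environment, or None."""
    overrides = (
        environment_data.get("config", {})
        .get("softwareConfig", {})
        .get("airflowConfigOverrides", {})
    )
    # Overrides are keyed as "<section>-<key>", so [api] auth_backend becomes "api-auth_backend".
    return overrides.get("api-auth_backend")
```

A return value of None means no override is set, so the environment uses the default of its Airflow version.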
Useful links:
  • Triggering DAGS (workflows) with GCS
  • make_iap_request.py repository
Source: a similar question found on Stack Overflow: https://stackoverflow.com/questions/64847488/
