
google-cloud-dataproc - Unable to parameterize any value under placement.managedCluster.config

Reposted. Author: 行者123. Updated: 2023-12-05 04:54:46

My goal is to create a Dataproc workflow template from Python code. I also want to be able to parameterize the placement.managedCluster.config.gceClusterConfig.subnetworkUri field when the template is instantiated.

I read the template from a JSON file, for example:

{
  "id": "bigquery-extractor",
  "placement": {
    "managed_cluster": {
      "config": {
        "gce_cluster_config": {
          "subnetwork_uri": "some-subnet-name"
        },
        "software_config": {
          "image_version": "1.5"
        }
      },
      "cluster_name": "some-name"
    }
  },
  "jobs": [
    {
      "pyspark_job": {
        "args": [
          "job_argument"
        ],
        "main_python_file_uri": "gs:///path-to-file"
      },
      "step_id": "extract"
    }
  ],
  "parameters": [
    {
      "name": "CLUSTER_NAME",
      "fields": [
        "placement.managedCluster.clusterName"
      ]
    },
    {
      "name": "SUBNETWORK_URI",
      "fields": [
        "placement.managedCluster.config.gceClusterConfig.subnetworkUri"
      ]
    },
    {
      "name": "MAIN_PY_FILE",
      "fields": [
        "jobs['extract'].pysparkJob.mainPythonFileUri"
      ]
    },
    {
      "name": "JOB_ARGUMENT",
      "fields": [
        "jobs['extract'].pysparkJob.args[0]"
      ]
    }
  ]
}

The code snippet I use:

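import json

from google.api_core.client_options import ClientOptions
from google.api_core.exceptions import AlreadyExists
# Assumption: "dataproc" in the original snippet refers to the dataproc_v1 client module.
from google.cloud import dataproc_v1 as dataproc

# region, project_id and path_to_file are defined elsewhere.
options = ClientOptions(api_endpoint="{}-dataproc.googleapis.com:443".format(region))
client = dataproc.WorkflowTemplateServiceClient(client_options=options)

# Parse the file as JSON instead of calling eval on it, and close it when done.
with open(path_to_file, "r") as template_file:
    template_dict = json.load(template_file)
print(template_dict)

template = dataproc.WorkflowTemplate(template_dict)

full_region_id = "projects/{project_id}/regions/{region}".format(project_id=project_id, region=region)
try:
    client.create_workflow_template(
        parent=full_region_id,
        template=template
    )
except AlreadyExists as err:
    # The template already exists; report it and carry on.
    print(err)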

When I run this code, I get the following error:

google.api_core.exceptions.InvalidArgument: 400 Invalid field path placement.managed_cluster.configuration.gce_cluster_config.subnetwork_uri: Field gce_cluster_config does not exist.

The behavior is the same if I try to parameterize placement.managedCluster.config.softwareConfig.imageVersion; in that case I get:

google.api_core.exceptions.InvalidArgument: 400 Invalid field path placement.managed_cluster.configuration.software_config.image_version: Field software_config does not exist.

However, if I exclude every field under placement.managedCluster.config from the parameters list, the template is created successfully.

I have not found any documented restriction on parameterizing these fields. Is there one, or am I just doing something wrong?
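The observation that creation succeeds once the parameters under placement.managedCluster.config are left out can be sketched in plain Python. This is only an illustration: `drop_config_parameters` and `CONFIG_PREFIX` are hypothetical names, not part of the Dataproc client, and the sample dict is a trimmed copy of the template above.

```python
import copy

# Field-path prefix that the API rejected in the errors quoted above.
CONFIG_PREFIX = "placement.managedCluster.config."


def drop_config_parameters(template):
    """Return a copy of the template dict without any parameter whose
    field paths point under placement.managedCluster.config."""
    cleaned = copy.deepcopy(template)
    cleaned["parameters"] = [
        param
        for param in cleaned.get("parameters", [])
        if not any(field.startswith(CONFIG_PREFIX) for field in param["fields"])
    ]
    return cleaned


# Trimmed-down version of the template from the question.
template = {
    "id": "bigquery-extractor",
    "parameters": [
        {"name": "CLUSTER_NAME", "fields": ["placement.managedCluster.clusterName"]},
        {
            "name": "SUBNETWORK_URI",
            "fields": ["placement.managedCluster.config.gceClusterConfig.subnetworkUri"],
        },
    ],
}

cleaned = drop_config_parameters(template)
print([param["name"] for param in cleaned["parameters"]])  # ['CLUSTER_NAME']
```

The helper works on a deep copy, so the dict read from the file is left unchanged.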

Best answer

The doc lists the fields that can be parameterized. For the managed cluster, it seems that only managedCluster.name is parameterizable:

Managed cluster name. Dataproc will use the user-supplied name as the name prefix, and append random characters to create a unique cluster name. The cluster is deleted at the end of the workflow.

I don't see managedCluster.config listed as parameterizable.
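If the subnetwork cannot be a template parameter, one possible workaround (not part of the answer above) is to write the value into the template dict in Python before the template is sent to the API. The sketch below assumes the snake_case layout of the JSON file in the question; `with_subnetwork` is a hypothetical helper and "my-subnet" is a placeholder value.

```python
import copy


def with_subnetwork(template, subnetwork_uri):
    """Return a copy of the template dict with the subnetwork set directly
    and the SUBNETWORK_URI parameter removed."""
    resolved = copy.deepcopy(template)
    config = resolved["placement"]["managed_cluster"]["config"]
    # Create gce_cluster_config if the template does not define it yet.
    config.setdefault("gce_cluster_config", {})["subnetwork_uri"] = subnetwork_uri
    resolved["parameters"] = [
        param
        for param in resolved.get("parameters", [])
        if param["name"] != "SUBNETWORK_URI"
    ]
    return resolved


# Trimmed-down version of the template from the question.
template = {
    "id": "bigquery-extractor",
    "placement": {
        "managed_cluster": {
            "config": {"gce_cluster_config": {"subnetwork_uri": "some-subnet-name"}},
            "cluster_name": "some-name",
        }
    },
    "parameters": [
        {"name": "CLUSTER_NAME", "fields": ["placement.managedCluster.clusterName"]},
        {
            "name": "SUBNETWORK_URI",
            "fields": ["placement.managedCluster.config.gceClusterConfig.subnetworkUri"],
        },
    ],
}

resolved = with_subnetwork(template, "my-subnet")
```

The resulting dict could then be wrapped in `dataproc.WorkflowTemplate(...)` and passed to `create_workflow_template` as in the question, or, if a stored template is not required, to the client's `instantiate_inline_workflow_template`. The trade-off is that the subnetwork is fixed when the template is built rather than when it is instantiated.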

For more on google-cloud-dataproc - Unable to parameterize any value under placement.managedCluster.config, see the similar question we found on Stack Overflow: https://stackoverflow.com/questions/65642068/
