
python - Passing values from a Glue Connection resource to a Python job

Reposted. Author: 行者123. Updated: 2023-12-03 07:39:52

In my AWS::Glue::Connection resource, I have set all the credentials needed to access my SQL Server database.

  GlueJDBCConnection:
    Type: AWS::Glue::Connection
    Properties:
      CatalogId: !Ref AWS::AccountId
      ConnectionInput:
        ConnectionType: "JDBC"
        ConnectionProperties:
          USERNAME: !Ref Username
          PASSWORD: !Ref Password
          JDBC_CONNECTION_URL: !Ref GlueJDBCStringTarget
          sslMode: 'REQUIRED'
        PhysicalConnectionRequirements:
          AvailabilityZone: !If [IsProd, !Ref AvailabilityZoneProd, !Ref AvailabilityZoneNonProd]
          SecurityGroupIdList:
            - Fn::GetAtt: GlueJobSecurityGroup.GroupId
          SubnetId: !If [IsProd, !Ref PrivateSubnetAz2, !Ref PrivateSubnetAz3]
        Name: !Ref JDBCConnectionName

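I need to use USERNAME and PASSWORD in my Python script, but I don't want them exposed in the Job parameters section of the AWS console. Is there another way to accomplish what I'm doing below?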

  GlueJob:
    Type: AWS::Glue::Job
    DependsOn: GlueSecurityConfiguration
    Properties:
      Name: !Ref GlueJobName
      Role: !Ref RoleForRTMI
      SecurityConfiguration: !Ref SecurityConfiguration
      Command:
        Name: glueetl
        PythonVersion: 3
        ScriptLocation: !Sub 's3://xyz-${AWS::AccountId}-xx-xxxx-0/${blablabla}'
      DefaultArguments:
        '--USER': !Ref Username
        '--PASS': !Ref Password
      Connections:
        Connections:
          - Ref: GlueJDBCConnection
      ExecutionProperty:
        MaxConcurrentRuns: 2
      #MaxCapacity: 2 #if used, don't use WorkerType and NumberOfWorkers
      WorkerType: G.1X
      NumberOfWorkers: 2
      MaxRetries: 1
      GlueVersion: '2.0'
      Tags:
        name: value_1

Python example:

from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from pyspark.sql import DataFrame


class FrameWriter:

    def __init__(self, environment: str, context: GlueContext):
        self.environment = environment
        self.context = context

    def write_frame(self, table_name: str, spark_df: DataFrame, rds_user: str, rds_pass: str):
        # Look up the JDBC URL from the Glue connection for this environment.
        rds_creds = glue_rds_cred(self.environment)
        rds_url = dict_recursive_lookup("JDBC_CONNECTION_URL", rds_creds)

        # Convert the Spark DataFrame to a DynamicFrame and write it over JDBC.
        glue_df = DynamicFrame.fromDF(spark_df, self.context, "glue_df")
        self.context.write_dynamic_frame.from_options(
            frame=glue_df,
            connection_type="sqlserver",
            connection_options={
                "url": f"{rds_url}/db_name",
                "user": rds_user,
                "password": rds_pass,
                "dbtable": f"rdm.{table_name}",
            },
            transformation_ctx="output",
        )


writer = FrameWriter(environment, glue_context)
writer.write_frame(name, sp_df, args["USER"], args["PASS"])

Best answer

I came up with the code below, which uses boto3 to retrieve the username and password, so that I don't expose them in the AWS Glue console:

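from typing import Optional

import boto3


def glue_rds_cred(environment) -> dict:
    # Fetch the Glue connection definition, including its connection properties.
    client_glue = boto3.client("glue")
    response_rds_pass = client_glue.get_connection(
        # CatalogId='string',
        Name=f"instance_name-{environment}",
        HidePassword=False,  # include the password in the response
    )
    return response_rds_pass


def dict_recursive_lookup(k: str, d: dict) -> Optional[str]:
    # Depth-first search for key k in a nested dict; returns None if absent.
    if k in d:
        return d[k]
    for v in d.values():
        if isinstance(v, dict):
            a = dict_recursive_lookup(k, v)
            if a is not None:
                return a
    return None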
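
To make the lookup concrete, here is a minimal sketch of how the two helpers might be combined so that the username and password come from the connection and not from job arguments. It repeats `dict_recursive_lookup` so that it runs on its own, and it uses a hand-written dictionary in place of a real `get_connection` call. The nesting under `Connection` and `ConnectionProperties` follows the response shape boto3 returns for Glue connections, but every value in `sample_response` is made up, and `build_connection_options` is a hypothetical helper that does not appear in the original answer. The URL is assembled the same way as in the question.

```python
from typing import Any, Optional


def dict_recursive_lookup(k: str, d: dict) -> Optional[Any]:
    """Depth-first search for key k in a nested dict; None if absent."""
    if k in d:
        return d[k]
    for v in d.values():
        if isinstance(v, dict):
            found = dict_recursive_lookup(k, v)
            if found is not None:
                return found
    return None


def build_connection_options(response: dict, db_name: str, table: str) -> dict:
    """Hypothetical helper: assemble JDBC options from a get_connection-style response."""
    rds_url = dict_recursive_lookup("JDBC_CONNECTION_URL", response)
    return {
        "url": f"{rds_url}/{db_name}",
        "user": dict_recursive_lookup("USERNAME", response),
        "password": dict_recursive_lookup("PASSWORD", response),
        "dbtable": table,
    }


# Stand-in for the dict returned by glue_rds_cred(environment); values are invented.
sample_response = {
    "Connection": {
        "Name": "instance_name-dev",
        "ConnectionType": "JDBC",
        "ConnectionProperties": {
            "USERNAME": "example_user",
            "PASSWORD": "example_password",
            "JDBC_CONNECTION_URL": "jdbc:sqlserver://example-host:1433",
        },
    }
}

options = build_connection_options(sample_response, "db_name", "rdm.my_table")
```

With something like this in place, `write_frame` would no longer need the `rds_user` and `rds_pass` parameters, and the `--USER` and `--PASS` entries could presumably be dropped from the job's `DefaultArguments`. The job's IAM role would need permission to call `glue:GetConnection` on the connection for the lookup to succeed.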

Regarding python - Passing values from a Glue Connection resource to a Python job, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/72005392/
