
apache-spark - pyWriteDynamicFrame: Unrecognized scheme null; expected s3, s3n or s3a [Glue to Redshift]


While running a Glue job, after performing the necessary transformations, I write the resulting Spark df to a Redshift table like this:

dynamic_df = DynamicFrame.fromDF(df, glue_context, "dynamic_df")

glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=dynamic_df, catalog_connection=args['catalog_connection'],
    connection_options={"dbtable": args['dbschema'] + "." + args['dbtable'], "database": args['database']},
    transformation_ctx="write_my_df")

But I get this exception:

19/08/23 14:29:31 ERROR __main__: Traceback (most recent call last):
File "/mnt/yarn/usercache/root/appcache/application_1572375324962_0001/container_1572375324962_0001_01_000001/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
return f(*a, **kw)
File "/mnt/yarn/usercache/root/appcache/application_1572375324962_0001/container_1572375324962_0001_01_000001/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling o190.pyWriteDynamicFrame.
: java.lang.IllegalArgumentException: Unrecognized scheme null; expected s3, s3n, or s3a

What am I doing wrong, and how can I fix it?

Best Answer

I was missing the redshift_tmp_dir parameter in the from_jdbc_conf call, as noted in the documentation. Glue writes to Redshift by first staging the data in S3 and then issuing a COPY, so when no temporary directory is supplied the staging path is null, which is exactly what the "Unrecognized scheme null" message is complaining about.

So the call is now:

glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=dynamic_df, catalog_connection=args['catalog_connection'],
    connection_options={"dbtable": args['dbschema'] + "." + args['dbtable'], "database": args['database']},
    redshift_tmp_dir="s3://my_bucket/my/location/", transformation_ctx="write_my_df")
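
As a side note, rather than hard-coding a bucket, a Glue job configured with a temporary directory receives a --TempDir job argument that can be passed straight through as redshift_tmp_dir. Below is a minimal sketch of the surrounding job context, assuming the argument names used above are passed in as job parameters and that df is built elsewhere in the script:

import sys

from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# 'TempDir' is supplied by Glue when the job has a temporary directory
# configured; the remaining names mirror the job arguments used above.
args = getResolvedOptions(
    sys.argv,
    ['TempDir', 'catalog_connection', 'database', 'dbschema', 'dbtable'])

glue_context = GlueContext(SparkContext.getOrCreate())

# ... build the Spark DataFrame `df` with the necessary transformations ...

dynamic_df = DynamicFrame.fromDF(df, glue_context, "dynamic_df")

glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=dynamic_df,
    catalog_connection=args['catalog_connection'],
    connection_options={"dbtable": args['dbschema'] + "." + args['dbtable'],
                        "database": args['database']},
    # Glue stages the rows here and then loads them into Redshift with COPY.
    redshift_tmp_dir=args['TempDir'],
    transformation_ctx="write_my_df")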

Regarding apache-spark - pyWriteDynamicFrame: Unrecognized scheme null; expected s3, s3n or s3a [Glue to Redshift], a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58623478/
