
apache-spark - Unable to read a jceks file in YARN cluster mode from Python


I am using a jceks file to decrypt a password, and I am unable to read the encrypted password in YARN cluster mode.

I have tried different approaches, including:

spark-submit --deploy-mode cluster \
  --files /localpath/credentials.jceks#credentials.jceks \
  --conf spark.hadoop.hadoop.security.credential.provider.path=jceks://file////localpath/credentials.jceks test.py
spark1 = SparkSession.builder.appName("xyz").master("yarn") \
    .enableHiveSupport() \
    .config("hive.exec.dynamic.partition", "true") \
    .config("hive.exec.dynamic.partition.mode", "nonstrict") \
    .getOrCreate()
x = spark1.sparkContext._jsc.hadoopConfiguration()
x.set("hadoop.security.credential.provider.path", "jceks://file///credentials.jceks")
a = x.getPassword("<password alias>")
passw = ""
for i in range(a.__len__()):
    passw = passw + str(a.__getitem__(i))

I get the following error:

AttributeError: 'NoneType' object has no attribute '__len__'



When I print a, it is None.
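For context, Hadoop's Configuration.getPassword falls through to None in PySpark when no configured provider (and no plain config property) can resolve the alias, which matches the error above. A minimal guard, reusing the x and "<password alias>" placeholders from the snippet, makes the failure mode explicit:

# Hypothetical guard around the question's lookup: getPassword returns
# None (Java null) when the alias or the jceks store cannot be resolved.
a = x.getPassword("<password alias>")
if a is None:
    raise ValueError("credential alias not resolved; provider path is "
                     + str(x.get("hadoop.security.credential.provider.path")))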

Best Answer

FWIW, if you put the jceks file into HDFS, the YARN workers will be able to find it when running in cluster mode; at least it worked for me. Hope it works for you.

hadoop fs -put ~/.jceks /user/<uid>/.jceks

spark1 = SparkSession.builder.appName("xyz").master("yarn") \
    .enableHiveSupport() \
    .config("hive.exec.dynamic.partition", "true") \
    .config("hive.exec.dynamic.partition.mode", "nonstrict") \
    .getOrCreate()
x = spark1.sparkContext._jsc.hadoopConfiguration()
jceks_hdfs_path = "jceks://hdfs@<host>/user/<uid>/.jceks"
x.set("hadoop.security.credential.provider.path", jceks_hdfs_path)
a = x.getPassword("<password alias>")
passw = ""
for i in range(a.__len__()):
    passw = passw + str(a.__getitem__(i))
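To keep the lookup reusable, the answer's steps can be folded into a small helper. This is a sketch, not part of the original answer: get_jceks_password is a hypothetical name, and the char-array handling follows the answer's loop, just written with join:

def get_jceks_password(spark, provider_path, alias):
    # Hypothetical helper wrapping the answer's approach: point the Hadoop
    # configuration at a jceks store in HDFS and look up one alias.
    conf = spark.sparkContext._jsc.hadoopConfiguration()
    conf.set("hadoop.security.credential.provider.path", provider_path)
    chars = conf.getPassword(alias)  # py4j char[]; None if alias is missing
    if chars is None:
        raise KeyError("alias %r not found in %s" % (alias, provider_path))
    return "".join(str(chars[i]) for i in range(len(chars)))

# Usage, with the same placeholders as the answer:
passw = get_jceks_password(spark1,
                           "jceks://hdfs@<host>/user/<uid>/.jceks",
                           "<password alias>")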

This way you do not need to specify --files and --conf in the arguments when running spark-submit. Hope it helps.
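If the jceks store does not exist yet, it can also be written straight to HDFS with the hadoop credential CLI instead of creating it locally and copying it over; a minimal sketch, assuming the same <host>, <uid>, and alias placeholders as above:

# Create the alias (prompts for the password) directly in an HDFS-backed
# store, so executors can reach it in cluster mode.
hadoop credential create <password alias> \
    -provider jceks://hdfs@<host>/user/<uid>/.jceks

# Confirm the alias is present.
hadoop credential list -provider jceks://hdfs@<host>/user/<uid>/.jceks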

Regarding "apache-spark - Unable to read a jceks file in YARN cluster mode from Python", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57656034/
