scala - Unable to connect Oracle with Apache Spark using an SSO wallet

We are trying to connect to a remote Oracle database running on Amazon RDS, using Apache Spark and an SSO wallet configured on our side. We can load the data with the spark-shell utility as described below.

Start the Spark shell with the JDBC and oraclepki jars added to the classpath:

 spark-shell --driver-class-path /path/to/ojdbc8.jar:/path/to/oraclepki.jar

This is the JDBC URL used:

 val JDBCURL="jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCPS)(HOST=www.example.aws.server.com)(PORT=1527))(CONNECT_DATA=(SID=XXX))(SECURITY = (SSL_SERVER_CERT_DN =\"C=US,ST=xxx,L=ZZZ,O=Amazon.com,OU=RDS,CN=www.xxx.aws.zzz.com\")))"
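Since this descriptor is long and quote-heavy, it can be less error-prone to assemble it from its parts; a minimal hypothetical helper (the function name and parameters are illustrative, not an Oracle API):

```scala
// Builds a TCPS JDBC descriptor like the one above from its parts.
// Everything here is a sketch; host, port, SID, and DN are placeholders.
def buildTcpsJdbcUrl(host: String, port: Int, sid: String, serverCertDn: String): String =
  s"jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCPS)(HOST=$host)(PORT=$port))" +
  s"""(CONNECT_DATA=(SID=$sid))(SECURITY=(SSL_SERVER_CERT_DN="$serverCertDn")))"""

val url = buildTcpsJdbcUrl("www.example.aws.server.com", 1527, "XXX",
  "C=US,ST=xxx,L=ZZZ,O=Amazon.com,OU=RDS,CN=www.xxx.aws.zzz.com")
```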

Below is the Spark JDBC call to load the data:

    spark.read.format("jdbc")
      .option("url", JDBCURL)
      .option("user", "USER")
      .option("oracle.net.tns_admin", "/path/to/tnsnames.ora")
      .option("oracle.net.wallet_location", "(SOURCE=(METHOD=file)(METHOD_DATA=(DIRECTORY=/path/to/ssl_wallet/)))")
      .option("password", "password")
      .option("javax.net.ssl.trustStore", "/path/to/cwallet.sso")
      .option("javax.net.ssl.trustStoreType", "SSO")
      .option("dbtable", QUERY)
      .option("driver", "oracle.jdbc.driver.OracleDriver")
      .load

But when we try to run it with the spark-submit command, we get the following error:

    Exception in thread "main" java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:774)
at oracle.jdbc.driver.PhysicalConnection.connect(PhysicalConnection.java:688)
...
...
...

Caused by: oracle.net.ns.NetException: The Network Adapter could not establish the connection
at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:523)
at oracle.net.resolver.AddrResolution.resolveAndExecute(AddrResolution.java:521)
at oracle.net.ns.NSProtocol.establishConnection(NSProtocol.java:660)
at oracle.net.ns.NSProtocol.connect(NSProtocol.java:286)
at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:1438)
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:518)
... 28 more
Caused by: oracle.net.ns.NetException: Unable to initialize ssl context.
at oracle.net.nt.CustomSSLSocketFactory.getSSLSocketEngine(CustomSSLSocketFactory.java:597)
at oracle.net.nt.TcpsNTAdapter.connect(TcpsNTAdapter.java:143)
at oracle.net.nt.ConnOption.connect(ConnOption.java:161)
at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:470)
... 33 more
Caused by: oracle.net.ns.NetException: Unable to initialize the key store.
at oracle.net.nt.CustomSSLSocketFactory.getKeyManagerArray(CustomSSLSocketFactory.java:642)
at oracle.net.nt.CustomSSLSocketFactory.getSSLSocketEngine(CustomSSLSocketFactory.java:580)
... 36 more
Caused by: java.security.KeyStoreException: SSO not found
at java.security.KeyStore.getInstance(KeyStore.java:851)
at oracle.net.nt.CustomSSLSocketFactory.getKeyManagerArray(CustomSSLSocketFactory.java:628)
... 37 more
Caused by: java.security.NoSuchAlgorithmException: SSO KeyStore not available
at sun.security.jca.GetInstance.getInstance(GetInstance.java:159)
at java.security.Security.getImpl(Security.java:695)
at java.security.KeyStore.getInstance(KeyStore.java:848)

I am new to Spark and may be doing something wrong here. This is how I tried to set up the configuration:

    val conf = new SparkConf().setAppName(JOB_NAME)
    conf.set("javax.net.ssl.trustStore", "/path/to/cwallet.sso")
    conf.set("javax.net.ssl.trustStoreType", "SSO")
    conf.set("oracle.net.tns_admin", "/path/to/tnsnames.ora")
    conf.set("oracle.net.wallet_location", "(SOURCE=(METHOD=file)(METHOD_DATA=(DIRECTORY=/path/to/ssl_wallet/dir/)))")
    conf.set("user", "user")
    conf.set("password", "pass")

Below is the spark-submit command used:

    spark-submit --class fully.qualified.path.to.main \
--jars /path/to/ojdbc8.jar,/path/to/oraclepki.jar,/path/to/osdt_cert.jar,/path/to/osdt_core.jar \
--deploy-mode client --files /path/to/hive-site.xml --master yarn \
--driver-memory 12G \
--conf "spark.executor.extraJavaOptions=-Djavax.net.ssl.trustStore=/path/to/cwallet.sso -Djavax.net.ssl.trustStoreType=SSO" \
--executor-cores 4 --executor-memory 12G \
--num-executors 20 /path/to/application.jar /path/to/application_custom_config.conf

Also tried adding

--conf 'spark.executor.extraJavaOptions=-Djavax.net.ssl.trustStore=/path/to/cwallet.sso -Djavax.net.ssl.trustStoreType=SSO'

--files /path/to/cwallet.sso,/path/to/tnsnames.ora

to the spark-submit command, but without any luck. What exactly am I doing wrong? I also tried the solution mentioned in this post, but got the same error. Do I need to make sure the trustStore is accessible on every executor node? If that is the case, why does the spark-shell command work fine? Does that mean spark-shell does not involve any worker nodes in executing the command?

Please advise.

Update:

It looks like you're using the JDBC driver from 12.1.0.2. Please upgrade to 18.3 which you can download from oracle.com/technetwork/database/application-development/jdbc/… Some changes have been made to make the use of wallets easier. -- @Jean de Lavarene

After making the change suggested by @Jean de Lavarene, the initial error went away, but below is what I get now:

    org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, example.server.net, executor 2): java.sql.SQLException: PKI classes not found. To use 'connect /' functionality, oraclepki.jar must be in the classpath: java.lang.NoClassDefFoundError: oracle/security/pki/OracleWallet
at oracle.jdbc.driver.PhysicalConnection.getSecretStoreCredentials(PhysicalConnection.java:3058)
at oracle.jdbc.driver.PhysicalConnection.parseUrl(PhysicalConnection.java:2823)

When I run it in Spark local mode (--master local[*]) it works fine, but it fails in yarn mode.

I am already using the --jars option with a comma-separated list of jars. Here is what I found:

1) --jars expects the paths to be local, and then copies them to an HDFS path
2) prefixing the paths with file:/// does not work
3) if I do not specify the --jars argument at all, the program complains about the missing JDBC driver class. Once I specify ojdbc8.jar with --jars, that error disappears and it starts reporting that oraclepki.jar is not found. I do not know why this happens.
4) also tried using : as the separator when specifying multiple jars, but without any luck
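The separator rules behind (1) and (4) are a common stumbling block: `--jars` and `--files` take comma-separated lists, while `--driver-class-path` is an ordinary JVM classpath and uses `:`. A small illustrative snippet (the jar paths are placeholders):

```scala
// Spark CLI flags use different separators:
//   --jars / --files     -> comma-separated list
//   --driver-class-path  -> colon-separated (standard JVM classpath)
val jars = Seq("/path/to/ojdbc8.jar", "/path/to/oraclepki.jar",
               "/path/to/osdt_cert.jar", "/path/to/osdt_core.jar")

val jarsArg      = jars.mkString(",")  // value for --jars
val classpathArg = jars.mkString(":")  // value for --driver-class-path
```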

Update 2:

I was able to resolve the oraclepki.jar not found exception using
    --driver-class-path /path/to/oraclepki.jar:/path/to/osdt_cert.jar:/path/to/others.jar 

But as soon as we run in --master yarn mode, the following exception shows up:

    Caused by: oracle.net.ns.NetException: Unable to initialize the key store.
at oracle.net.nt.CustomSSLSocketFactory.getKeyManagerArray(CustomSSLSocketFactory.java:617)
at oracle.net.nt.CustomSSLSocketFactory.createSSLContext(CustomSSLSocketFactory.java:322)
... 32 more
Caused by: java.io.FileNotFoundException: /path/to/cwallet.sso (No such file or directory)

From my understanding, when the job is launched from the worker nodes, the cwallet.sso file path is not available on those nodes. We tried specifying an HDFS path for the wallet, but the utility expects a local path when the wallet is used.

So do we need to manually copy the wallet file to all the worker nodes? Or is there a better alternative to achieve this?
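One way to surface this failure mode before the JDBC call is a preflight check that the wallet file actually exists at the expected local path. A minimal hypothetical sketch (the file list and the idea of running it per executor via `mapPartitions` are assumptions, not part of the original setup; shown here as a plain local check):

```scala
import java.io.File

// Returns an error message for each expected wallet file that is missing
// at a local path. To check every executor, this could be invoked inside
// a mapPartitions over a dummy dataset; here it runs locally.
def missingWalletFiles(walletDir: String): Seq[String] = {
  val expected = Seq("cwallet.sso") // illustrative; add ewallet.p12 etc. as needed
  expected.flatMap { name =>
    val f = new File(walletDir, name)
    if (f.isFile) None else Some(s"missing: ${f.getAbsolutePath}")
  }
}
```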

Please advise.

Best Answer

Basically, this is how we solved the problem. The important thing to keep in mind here is that the SSO file must exist on all the nodes where Spark will run (Spark's executor nodes):

    val SOURCE_DF = spark.read.format("jdbc")
      .option("url", "jdbc:oracle:thin:@...full string here")
      .option("oracle.net.wallet_location", "(SOURCE=(METHOD=file)(METHOD_DATA=(DIRECTORY=/path/to/sso/dir)))")
      ...
      ...

You can add more .option parameters if you need to pass additional details:

    .option("oracle.net.tns_admin", "oracle/tns/file/path")
    .option("javax.net.ssl.trustStoreType", "sso")
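Pulling the snippets above together, one way to keep the option set in a single place is a small helper that returns the options as a map (a sketch; all paths, the user, and the table are placeholders):

```scala
// Assembles the JDBC options shown above into one map so they can be
// reused across jobs. Paths and names are illustrative placeholders.
def oracleWalletOptions(url: String, walletDir: String, tnsPath: String,
                        user: String, dbtable: String): Map[String, String] = Map(
  "url"     -> url,
  "user"    -> user,
  "dbtable" -> dbtable,
  "driver"  -> "oracle.jdbc.driver.OracleDriver",
  "oracle.net.tns_admin" -> tnsPath,
  "oracle.net.wallet_location" ->
    s"(SOURCE=(METHOD=file)(METHOD_DATA=(DIRECTORY=$walletDir)))",
  "javax.net.ssl.trustStoreType" -> "sso"
)
```

This would be passed as `spark.read.format("jdbc").options(oracleWalletOptions(...)).load()`; the wallet directory must still resolve to a local path on every executor.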

Regarding scala - Unable to connect Oracle with Apache Spark using an SSO wallet, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53616496/
