I am creating a PySpark 3.4.1 application for development in Docker with Python 3.11.5. It should be able to connect to multiple types of databases through JDBC connections. I am testing the connection with a local Oracle DB that I set up using Docker in another container. However, I am getting the following error:
Py4JJavaError: An error occurred while calling o316.load.
: java.sql.SQLException: ORA-12541: Cannot connect. No listener at host 127.0.0.1 port 1521. (CONNECTION_ID=ZHRvV1iVRICqrdGeoIq7BQ==)
When I run (using jar: "ojdbc11.jar"):
connection_opts = {
    "driver": "oracle.jdbc.driver.OracleDriver",
    "url": "jdbc:oracle:thin:@127.0.0.1:1521/FREEPDB1",
    "dbtable": "select * from xtable",
    "user": "my_db_admin",
    "password": "20pwd23",
}
df = spark.read.format("jdbc").options(**connection_opts).load()
The docker-compose.yml file is as follows:
version: "3.3"
services:
  spark-master:
    image: my_pyspark_image:latest
    tty: true
    stdin_open: true
    ports:
      - "9090:8080"
      - "7077:7077"
  oracle-localdb:
    # Creation reference: https://hub.docker.com/r/gvenzl/oracle-free
    image: gvenzl/oracle-free:slim
    shm_size: 1g
    ports:
      - '1521:1521'
    environment:
      ORACLE_RANDOM_PASSWORD: true
      APP_USER: my_db_admin
      APP_USER_PASSWORD: 20pwd23
    volumes:
      - type: volume
        source: pyspark_oracle-volume
        target: /opt/oracle/oradata
volumes:
  pyspark_oracle-volume:
    # external: true
When I run lsnrctl status:
LSNRCTL for Linux: Version 23.0.0.0.0 - Developer-Release on 09-SEP-2023 18:49:38
Copyright (c) 1991, 2023, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC_FOR_FREE)))
STATUS of the LISTENER
------------------------
Alias LISTENER
Version TNSLSNR for Linux: Version 23.0.0.0.0 - Developer-Release
Start Date 09-SEP-2023 18:47:38
Uptime 0 days 0 hr. 1 min. 59 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Default Service FREE
Listener Parameter File /opt/oracle/product/23c/dbhomeFree/network/admin/listener.ora
Listener Log File /opt/oracle/diag/tnslsnr/6b3c2441425c/listener/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC_FOR_FREE)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=0.0.0.0)(PORT=1521)))
Services Summary...
Service "FREE" has 1 instance(s).
Instance "FREE", status READY, has 1 handler(s) for this service...
Service "FREEXDB" has 1 instance(s).
Instance "FREE", status READY, has 0 handler(s) for this service...
Service "fb99f7d127aa0bafe0536402000a43b5" has 1 instance(s).
Instance "FREE", status READY, has 1 handler(s) for this service...
Service "freepdb1" has 1 instance(s).
Instance "FREE", status READY, has 1 handler(s) for this service...
The command completed successfully
The listener.ora:
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC_FOR_FREE))
      (ADDRESS = (PROTOCOL = TCP)(HOST = 0.0.0.0)(PORT = 1521))
    )
  )
DEFAULT_SERVICE_LISTENER = FREE
The tnsnames.ora:
FREE =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 0.0.0.0)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = FREE)
    )
  )
FREEPDB1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 0.0.0.0)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = FREEPDB1)
    )
  )
EXTPROC_CONNECTION_DATA =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC_FOR_FREE))
    )
    (CONNECT_DATA =
      (SID = PLSExtProc)
      (PRESENTATION = RO)
    )
  )
I already tried the same connection with other custom-created databases and with sys users. I also tried creating the container with multiple ports, but had no luck. Finally, I do not know if this will help, but I tried this and had no luck either.
I tried making the connection with the oracledb python library:
import oracledb
connection = oracledb.connect(user="my_db_admin", password='20pwd23',
                              host="127.0.0.1", port=1521, service_name="freepdb1")
But I get this:
OperationalError: DPY-6005: cannot connect to database (CONNECTION_ID=M88go017TW5iuuV35FqWhw==).
[Errno 111] Connection refused
Could somebody explain what I'm missing, how to solve it, and, if possible, how to better understand the .ora files?
I want to start with a question:
Are you running the Oracle database in your 'spark-master' container?
Because that is what your connection settings imply when you specify localhost/127.0.0.1.
localhost is local to the container. There is no database running in your 'spark-master' container; it runs as a separate service/container alongside it. When you use Docker Compose, service discovery happens behind the scenes, and services can reach each other by name.
To contact the Oracle database service, use the name oracle-localdb as the host.
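To see the difference for yourself, a quick TCP reachability check from inside the 'spark-master' container can help. This is only a minimal sketch: the helper name can_connect is made up for illustration, and it tests nothing more than whether a TCP connection opens (it does not talk to the Oracle listener itself).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example, run from inside the spark-master container:
#   can_connect("127.0.0.1", 1521)       # the container itself
#   can_connect("oracle-localdb", 1521)  # the Compose service
```

If the first call fails and the second succeeds, the database is fine and only the hostname in the connection settings needs to change.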
Now, this should work for you:
import oracledb
connection = oracledb.connect(user="my_db_admin", password='20pwd23',
                              host="oracle-localdb", port=1521, service_name="freepdb1")
Best of luck!
That is quite the revelation for me; I will have to review my understanding of Docker. Thank you so much.
Note that the answer uses Oracle's Python DB API-compliant python-oracledb driver, which has a different API from JDBC, so the connection string may be slightly different. But the answer's main point about using the correct hostname still holds.
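For the PySpark side of the question, the same hostname fix might be sketched as below. The helper names oracle_jdbc_options and read_oracle are hypothetical, and the sketch assumes that ojdbc11.jar is on Spark's classpath and that both containers share the Compose network. It also swaps "dbtable" for Spark's "query" option, because "dbtable" expects a table name or a parenthesized subquery rather than a bare SELECT statement.

```python
def oracle_jdbc_options(host, port, service_name, user, password, query):
    """Build Spark JDBC reader options for an Oracle thin connection."""
    return {
        "driver": "oracle.jdbc.driver.OracleDriver",
        # Thin-driver URL form: jdbc:oracle:thin:@//host:port/service_name
        "url": f"jdbc:oracle:thin:@//{host}:{port}/{service_name}",
        # "query" takes a SELECT statement; "dbtable" expects a table name
        # or a parenthesized subquery.
        "query": query,
        "user": user,
        "password": password,
    }


def read_oracle(spark, options):
    """Load a DataFrame through Spark's JDBC source (spark is a SparkSession)."""
    return spark.read.format("jdbc").options(**options).load()


opts = oracle_jdbc_options(
    host="oracle-localdb",  # Compose service name, not 127.0.0.1
    port=1521,
    service_name="FREEPDB1",
    user="my_db_admin",
    password="20pwd23",
    query="select * from xtable",
)
# df = read_oracle(spark, opts)  # needs a running SparkSession with the Oracle JDBC jar
```

Keeping the options in a small builder makes it easy to switch hosts between a Compose network (service name) and a run from the host machine (localhost with the published port).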