python - Is the connection pool in SQLAlchemy thread-safe?

Reposted. Author: 行者123. Updated: 2023-12-02 22:50:16

The documentation says that the connection pool is also not designed for multithreading:

It’s critical that when using a connection pool, and by extension when using an Engine created via create_engine(), that the pooled connections are not shared to a forked process. TCP connections are represented as file descriptors, which usually work across process boundaries, meaning this will cause concurrent access to the file descriptor on behalf of two or more entirely independent Python interpreter states.

As I understand it, if I create a connection pool:

self.engine = create_engine(
    'postgresql://{user}:{password}@{host}:{port}/{db}'.format(
        user=Configuration().get(section='repository', option='user'),
        password=Configuration().get(section='repository', option='password'),
        host=Configuration().get(section='repository', option='host'),
        port=Configuration().get(section='repository', option='port'),
        db=Configuration().get(section='repository', option='database')
    ),
    echo=False,
    pool_size=3
)

self.session = sessionmaker(self.engine, expire_on_commit=False)

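and then call self.session() in different threads, I will have 3 different connections that are used in N different threads. Does that mean that only 3 concurrent threads will do some work while the others wait until one or more threads call session.close()? Or is there a chance that >2 threads will use the same connection simultaneously?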

Is NullPool safer (since each new session is a new connection)?

self.engine = create_engine(
    'postgresql://{user}:{password}@{host}:{port}/{db}'.format(
        user=Configuration().get(section='repository', option='user'),
        password=Configuration().get(section='repository', option='password'),
        host=Configuration().get(section='repository', option='host'),
        port=Configuration().get(section='repository', option='port'),
        db=Configuration().get(section='repository', option='database')
    ),
    echo=False,
    poolclass=NullPool
)

General question: is it OK to use the same connection pool in a case like this:

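from multiprocessing import Pool

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

engine = create_engine('connection_string', echo=False, pool_size=3)
Session = sessionmaker(engine)

def some_function(item):
    session = Session()
    ...

pool = Pool(processes=10)
# Pool.map needs an iterable of work; `items` is a placeholder not shown in the question.
pool.map(some_function, items)
pool.close()
pool.join()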

Best answer

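All in all, there seems to be a mix-up between threads and processes. The question starts by asking whether a SQLAlchemy connection pool is thread-safe, but ends with a code example that uses multiprocessing. The short answer to the "general question" is: no, you should not share an engine and its associated connection pool across process boundaries if forking is used. There are exceptions, though.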

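The pool implementations are themselves thread-safe, and by proxy an Engine is thread-safe as well, because an engine holds no state other than a reference to the pool. On the other hand, the connections checked out from a pool are not thread-safe, and neither is a Session.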
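As a rough illustration of that split, the sketch below shares one Engine and one sessionmaker among several threads while every thread opens and closes its own Session. It makes several assumptions that are not part of the question: a throwaway SQLite file stands in for PostgreSQL, QueuePool is requested explicitly so that pool_size=3 applies, SQLAlchemy 1.4 or newer is installed, and the names worker and results are made up for the example.

```python
import os
import tempfile
import threading

from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker
from sqlalchemy.pool import QueuePool

# Assumption: a temporary SQLite file replaces the PostgreSQL URL from the question.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
engine = create_engine(
    "sqlite:///" + db_path,
    poolclass=QueuePool,
    pool_size=3,
    # Lets a pooled SQLite connection be reused by a different thread later on.
    connect_args={"check_same_thread": False},
)
Session = sessionmaker(engine)  # the factory is shared, the sessions are not

results = []
results_lock = threading.Lock()


def worker(n):
    # Each thread creates, uses, and closes its own Session (hypothetical workload).
    with Session() as session:
        value = session.execute(text("SELECT :n * 2"), {"n": n}).scalar_one()
    with results_lock:
        results.append(value)


threads = [threading.Thread(target=worker, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
engine.dispose()
```

Only the Engine and the session factory cross thread boundaries here; no Session or Connection object is ever handed from one thread to another.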

Documentation says that connection pool also is not designed for multithreading:

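That is a bit of a misreading, since the original quote from the documentation is about sharing connection pools across process boundaries when forking is used. Doing so is likely to cause trouble, because beneath the SQLAlchemy and DB-API layers there is usually a TCP/IP socket or a file handle, and those should not be operated on concurrently.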

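In this particular case using a NullPool would be safe while the other pools are not, since it does not pool at all, and so connections won't be shared between processes unless one goes out of their way to do so.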
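One way to see that NullPool "does not pool at all" is to count how many DBAPI connections an Engine creates. The sketch below is only an illustration under assumptions of its own: it uses a temporary SQLite file, the pool "connect" event, and a hypothetical helper named count_dbapi_connects.

```python
import os
import tempfile

from sqlalchemy import create_engine, event, text
from sqlalchemy.pool import NullPool, QueuePool

# Assumption: a temporary SQLite file stands in for the real database.
db_url = "sqlite:///" + os.path.join(tempfile.mkdtemp(), "demo.db")


def count_dbapi_connects(poolclass):
    """Open and close two Connections in sequence; count new DBAPI connections."""
    engine = create_engine(db_url, poolclass=poolclass)
    created = []

    @event.listens_for(engine, "connect")
    def on_connect(dbapi_connection, connection_record):
        # Fires once for every brand-new DBAPI connection.
        created.append(1)

    for _ in range(2):
        with engine.connect() as conn:
            conn.execute(text("SELECT 1"))
    engine.dispose()
    return len(created)


pooled = count_dbapi_connects(QueuePool)   # second checkout reuses the first connection
unpooled = count_dbapi_connects(NullPool)  # every checkout opens a fresh connection
```

With NullPool nothing stays open between checkouts, so there is no idle connection left for a forked child to inherit.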

Does it mean that only 3 concurrent thread will do some work while others will wait until one or more thread will call session.close()?

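Assuming a QueuePool is in use, the configured size is not a hard limit and there is some room for overflow. The size determines the number of connections kept persistently in the pool. If the overflow limit is reached, the call waits for timeout seconds and, if no connection has become available by then, gives up and raises a TimeoutError.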
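The size, overflow, and timeout behaviour can be tried out directly against a QueuePool. This is a minimal sketch under assumptions that are not in the question: in-memory SQLite connections replace real database connections, and the limits are shrunk (pool_size=1, max_overflow=1, timeout=0.2) so that the demo finishes quickly.

```python
import sqlite3

from sqlalchemy.exc import TimeoutError as PoolTimeoutError
from sqlalchemy.pool import QueuePool


def creator():
    # Assumption: in-memory SQLite connections stand in for real ones.
    return sqlite3.connect(":memory:", check_same_thread=False)


# One persistent connection, one overflow connection, a short wait for the demo.
pool = QueuePool(creator, pool_size=1, max_overflow=1, timeout=0.2)

first = pool.connect()    # served within pool_size
second = pool.connect()   # served from the overflow allowance
peak_checked_out = pool.checkedout()

try:
    pool.connect()        # both limits reached: waits `timeout` seconds
    timed_out = False
except PoolTimeoutError:
    timed_out = True

first.close()             # hands the connection back to the pool
third = pool.connect()    # succeeds now that one is available again
got_third = third is not None

second.close()
third.close()
pool.dispose()
```

In the question's setup, pool_size=3 combined with the default max_overflow would therefore allow more than three sessions to hold connections at once; the remaining callers block until a connection is returned or the timeout expires.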

Or there is a chance that >2 threads will use the same connection simultaneously?

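Two or more threads cannot accidentally check out the same connection from a pool, except with a StaticPool, but one could explicitly share a connection between threads afterwards (don't).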


In the end, "Working with Engines and Connections - Basic Usage" covers the main parts of the question:

A single Engine manages many individual DBAPI connections on behalf of the process and is intended to be called upon in a concurrent fashion [emphasis added].

...

For a multiple-process application that uses the os.fork system call, or for example the Python multiprocessing module, it’s usually required that a separate Engine be used for each child process. This is because the Engine maintains a reference to a connection pool that ultimately references DBAPI connections - these tend to not be portable across process boundaries. An Engine that is configured not to use pooling (which is achieved via the usage of NullPool) does not have this requirement.

Regarding "python - Is the connection pool in SQLAlchemy thread-safe?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51769299/
