python - pymongo MongoClient doesn't work with multiprocessing?

Reposted. Author: IT老高. Updated: 2023-10-28 13:30:30

I'm using pymongo 3.2 and I want to use it with multiprocessing:

client = MongoClient(JD_SEARCH_MONGO_URI, connect=False)
db = client.jd_search

with concurrent.futures.ProcessPoolExecutor(max_workers=1) as executor:
    for jd in db['sample_data'].find():
        jdId = jd["jdId"]
        for cv in db["sample_data"].find():
            itemId = cv["itemId"]
            executor.submit(intersect_compute, jdId, itemId)
            # print "done {} => {}".format(jdId, itemId)

But I get this warning:

UserWarning: MongoClient opened before fork. Create MongoClient with connect=False, or create client after forking. See PyMongo's documentation for details: http://api.mongodb.org/python/current/faq.html#using-pymongo-with-multiprocessing>

As you can see, I have already set connect to False, as the documentation says.

Best answer

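What you are doing matches the documentation exactly (the URL in the warning), but it matches the "Never do this" section.
P.S. I have updated your code example at the end of this answer.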

Create the connection to the database inside each process:

# Each process creates its own instance of MongoClient.
def func():
    db = pymongo.MongoClient().mydb
    # Do something with db.

proc = multiprocessing.Process(target=func)
proc.start()
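
When the worker function runs many times in the same process, building a new MongoClient on every call is wasteful. One way to keep the "one client per process" rule while reusing the client is to cache it by process ID. The following is only a sketch: `get_client`, `_clients`, and the `factory` argument are hypothetical names, not part of PyMongo.

```python
import os

# Maps process ID -> client created in that process (hypothetical helper state).
_clients = {}


def get_client(factory):
    """Return this process's client, creating it on first use.

    `factory` is any zero-argument callable that builds a client,
    for example `lambda: pymongo.MongoClient(uri)`.
    """
    pid = os.getpid()
    if pid not in _clients:
        # A forked child inherits the parent's dict, but its PID differs,
        # so it builds a fresh client instead of reusing the inherited one.
        _clients[pid] = factory()
    return _clients[pid]
```

Inside a worker, `db = get_client(lambda: pymongo.MongoClient(uri)).mydb` could then replace the direct `pymongo.MongoClient()` call, where `uri` is a placeholder for your connection string.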

Never do this:

client = pymongo.MongoClient()

# Each child process attempts to copy a global MongoClient
# created in the parent process. Never do this.
def func():
    db = client.mydb
    # Do something with db.

proc = multiprocessing.Process(target=func)
proc.start()

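What you need to change is where the database connection is initialized: move it so that it happens in each process after the fork. That way every process has its own independent connection.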

Your example, updated:

with concurrent.futures.ProcessPoolExecutor(max_workers=1) as executor:
    client = MongoClient(JD_SEARCH_MONGO_URI, connect=False)
    db = client.jd_search

    for jd in db['sample_data'].find():
        jdId = jd["jdId"]
        for cv in db["sample_data"].find():
            itemId = cv["itemId"]
            executor.submit(intersect_compute, jdId, itemId)
            # print "done {} => {}".format(jdId, itemId)
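
Note that in the listing above the client is still constructed in the parent process, so it only stays out of the workers if `intersect_compute` never touches it. If the worker itself needs database access, one option is to open the connection in a pool initializer, which `ProcessPoolExecutor` runs once in each worker process (the `initializer` argument requires Python 3.7 or later). The following is a sketch under assumptions, not the original author's code: `open_client`, `init_worker`, `run`, the URI value, and the body of `intersect_compute` are all placeholders.

```python
import concurrent.futures

# Placeholder: the real value of JD_SEARCH_MONGO_URI is not shown in the question.
JD_SEARCH_MONGO_URI = "mongodb://localhost:27017"

_db = None  # set once per worker process by init_worker


def open_client():
    # Imported here to keep the sketch loadable without pymongo installed;
    # a top-level import works equally well.
    from pymongo import MongoClient
    return MongoClient(JD_SEARCH_MONGO_URI)


def init_worker(opener=open_client):
    """Runs once inside each worker process, so the client is created there."""
    global _db
    _db = opener().jd_search


def intersect_compute(jd_id, item_id):
    # Placeholder body: the original function is not shown.
    # A real implementation would query _db here.
    if _db is None:
        raise RuntimeError("init_worker has not run in this process")
    return jd_id, item_id


def run():
    # The parent uses a short-lived client of its own to read the IDs,
    # then closes it before any worker process is started.
    client = open_client()
    try:
        samples = client.jd_search["sample_data"]
        jd_ids = [jd["jdId"] for jd in samples.find()]
        item_ids = [cv["itemId"] for cv in samples.find()]
    finally:
        client.close()

    with concurrent.futures.ProcessPoolExecutor(
        max_workers=1, initializer=init_worker
    ) as executor:
        futures = [
            executor.submit(intersect_compute, jd_id, item_id)
            for jd_id in jd_ids
            for item_id in item_ids
        ]
        for future in concurrent.futures.as_completed(futures):
            future.result()  # re-raises any exception from the worker
```

Calling `run()` under an `if __name__ == "__main__":` guard keeps the sketch usable with the spawn start method as well as with fork.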

Regarding "python - pymongo MongoClient doesn't work with multiprocessing?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/34782789/
