mongodb - Cursor not found when using parallel scan with Pymongo

Reposted. Author: 可可西里. Updated: 2023-11-01 10:43:42

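I have a mongo database containing 3,000,000 documents that I process with pymongo. I want to iterate over all of the documents once, without updating the collection. I tried to do this using four threads: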

    cursors = db[collection].parallel_scan(CURSORS_NUM)
    threads = [
        threading.Thread(target=process_cursor, args=(cursor, )) for cursor in cursors
    ]

    for thread in threads:
        thread.start()

    for thread in threads:
        thread.join()

And the process_cursor function:

    def process_cursor(cursor):
        for document in cursor:
            dosomething(document)

After processing documents for a while, I get this error:

      File "extendDocuments.py", line 133, in process_cursor
        for document in cursor:
      File "/usr/local/lib/python2.7/dist-packages/pymongo/command_cursor.py", line 165, in next
        if len(self.__data) or self._refresh():
      File "/usr/local/lib/python2.7/dist-packages/pymongo/command_cursor.py", line 142, in _refresh
        self.__batch_size, self.__id))
      File "/usr/local/lib/python2.7/dist-packages/pymongo/command_cursor.py", line 110, in __send_message
        *self.__decode_opts)
      File "/usr/local/lib/python2.7/dist-packages/pymongo/helpers.py", line 97, in _unpack_response
        cursor_id)
    CursorNotFound: cursor id '116893918402' not valid at server

If I were using find() instead, I could set timeout to False to avoid this. Can I do something similar with the cursors I get from the parallel scan?

Best answer

There is currently no way to turn off the idle timeout for cursors returned from parallelCollectionScan. I have opened a feature request:

https://jira.mongodb.org/browse/SERVER-15042
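
Until something like that is available, one possible workaround is to skip the parallel scan and give each thread its own find() cursor over a separate _id range, since find() cursors can have the idle timeout disabled. The following is only a minimal sketch under some assumptions: the _id values are comparable and you already know sensible split points, and process_range, scan_in_threads, boundaries and handle are hypothetical names that do not come from the question. It uses no_cursor_timeout=True, which is the PyMongo 3+ spelling; PyMongo 2.x used timeout=False as mentioned in the question. As far as I know, later PyMongo and MongoDB releases dropped parallel_scan and parallelCollectionScan altogether, so a range split like this may be needed on newer versions anyway.

```python
import threading


def process_range(collection, lower, upper, handle):
    """Iterate documents with lower <= _id < upper using a find() cursor."""
    query = {"_id": {"$gte": lower, "$lt": upper}}
    # PyMongo 3+ option; on PyMongo 2.x the equivalent was timeout=False.
    cursor = collection.find(query, no_cursor_timeout=True)
    try:
        for document in cursor:
            handle(document)
    finally:
        # A cursor opened without a timeout stays alive on the server
        # until it is exhausted or closed, so close it explicitly.
        cursor.close()


def scan_in_threads(collection, boundaries, handle):
    """Run one thread per [boundaries[i], boundaries[i + 1]) range."""
    threads = [
        threading.Thread(target=process_range, args=(collection, lo, hi, handle))
        for lo, hi in zip(boundaries, boundaries[1:])
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

With integer ids from 0 to 3,000,000, a call such as scan_in_threads(db[collection], [0, 750000, 1500000, 2250000, 3000001], dosomething) would give four threads of roughly equal size. The split points are up to you; the ranges only need to cover every _id exactly once.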

For this topic, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24997404/
