
python - Merge join of two generators in Python

Reposted. Author: 行者123. Updated: 2023-11-28 22:05:35

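I want to merge two Kyoto Cabinet B-tree databases by key (Kyoto Cabinet Python API). The resulting list should contain every unique key (and its value) found in either of the two input databases.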

The following code works, but I think it is ugly.
left_generator/right_generator are two cursor objects. It is especially odd that get() returns None once the generator is exhausted.

def merge_join_kv(left_generator, right_generator):
    stop = False
    while left_generator.get() or right_generator.get():
        try:
            comparison = cmp(right_generator.get_key(), left_generator.get_key())
            if comparison == 0:
                yield left_generator.get_key(), left_generator.get_value()
                left_generator.next()
                right_generator.next()
            elif (comparison < 0) or (not left_generator.get() or not right_generator.get()):
                yield right_generator.get_key(), right_generator.get_value()
                right_generator.next()
            else:
                yield left_generator.get_key(), left_generator.get_value()
                left_generator.next()
        except StopIteration:
            if stop:
                raise
            stop = True

In general: is there a function/library that merge-joins generators using cmp()?

Best answer

I think this is what you need; orderedMerge is based on Gnibbler's code, but adds a custom key function and a unique argument:

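import kyotocabinet
import collections
import heapq

# Note: written for Python 2 (iterator protocol via next());
# under Python 3 use collections.abc.Iterator and __next__ instead.
class IterableCursor(kyotocabinet.Cursor, collections.Iterator):
    def __init__(self, *args, **kwargs):
        kyotocabinet.Cursor.__init__(self, *args, **kwargs)
        collections.Iterator.__init__(self)

    def next(self):
        "Return (key,value) pair"
        res = self.get(True)  # fetch the current pair, then step forward
        if res is None:
            raise StopIteration
        else:
            return res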

def orderedMerge(*iterables, **kwargs):
    """Take a list of ordered iterables; return as a single ordered generator.

    @param key:     function, for each item return key value
                    (Hint: to sort descending, return negated key value)

    @param unique:  boolean, return only first occurrence for each key value?
    """
    key = kwargs.get('key', (lambda x: x))
    unique = kwargs.get('unique', False)

    _heapify = heapq.heapify
    _heapreplace = heapq.heapreplace
    _heappop = heapq.heappop
    _StopIteration = StopIteration

    # preprocess iterators as heapqueue
    h = []
    for itnum, it in enumerate(map(iter, iterables)):
        try:
            next = it.next  # Python 2 iterator protocol
            data = next()
            keyval = key(data)
            h.append([keyval, itnum, data, next])
        except _StopIteration:
            pass
    _heapify(h)

    # process iterators in ascending key order
    oldkeyval = None
    while True:
        try:
            while True:
                keyval, itnum, data, next = s = h[0]  # get smallest-key value
                                                      # raises IndexError when h is empty
                # if unique, skip duplicate keys
                if unique and keyval == oldkeyval:
                    pass
                else:
                    yield data
                    oldkeyval = keyval

                # load replacement value from same iterator
                s[2] = data = next()  # raises StopIteration when exhausted
                s[0] = key(data)
                _heapreplace(h, s)    # restore heap condition
        except _StopIteration:
            _heappop(h)               # remove empty iterator
        except IndexError:
            return
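
The IterableCursor wrapper above depends on get(True) returning None once the cursor is exhausted. As a rough Python 3 sketch of the same adaptation, the two-argument form of the built-in iter() can turn such a "returns None at the end" call into an ordinary iterator. FakeCursor and iter_cursor below are hypothetical names introduced only for illustration; FakeCursor is an in-memory stand-in that assumes the get(step) behaviour used in the answer, not the real kyotocabinet class.

```python
class FakeCursor:
    """Hypothetical stand-in for a cursor: get(True) returns the current
    (key, value) pair and steps forward, or returns None once exhausted."""

    def __init__(self, records):
        self._records = list(records)
        self._pos = 0

    def get(self, step=False):
        if self._pos >= len(self._records):
            return None
        record = self._records[self._pos]
        if step:
            self._pos += 1
        return record


def iter_cursor(cursor):
    """Return an iterator of (key, value) pairs read from the cursor."""
    # Two-argument iter() calls the function repeatedly and stops
    # as soon as it returns the sentinel (None here).
    return iter(lambda: cursor.get(True), None)


pairs = list(iter_cursor(FakeCursor([("a", 1), ("b", 2)])))
```

With an adapter like this, no subclass is needed; the resulting iterator can be passed to any function that accepts ordered iterables.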

Then your function can be written as

from operator import itemgetter

def merge_join_kv(leftGen, rightGen):
    # assuming that kyotocabinet.Cursor has a copy initializer
    leftIter = IterableCursor(leftGen)
    rightIter = IterableCursor(rightGen)

    return orderedMerge(leftIter, rightIter, key=itemgetter(0), unique=True)
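
On the general question of whether a library function already does this: in Python 3.5 and later, the standard library's heapq.merge accepts a key argument, so a similar result can be sketched without a hand-written heap loop. The example below is only an illustration under assumptions: merge_unique is a hypothetical name, the inputs are plain sorted lists of (key, value) pairs standing in for cursors, and itertools.groupby takes the place of the unique flag by keeping the first pair for each key.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_unique(*sorted_pairs):
    """Merge sorted (key, value) iterables, keeping the first pair per key."""
    # heapq.merge is lazy and, for equal keys, yields items from
    # earlier iterables first, so the leftmost input wins on ties.
    merged = heapq.merge(*sorted_pairs, key=itemgetter(0))
    for _, group in groupby(merged, key=itemgetter(0)):
        yield next(group)  # each group holds at least one pair


left = [("a", 1), ("c", 3)]
right = [("a", 10), ("b", 2)]
result = list(merge_unique(left, right))
```

Here the pair ("a", 1) from the left input is kept and ("a", 10) is dropped, which matches the behaviour of unique=True with the left iterator listed first.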

Regarding python - Merge join of two generators in Python, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/5023266/
