python - 如何在 Python 中实现优先级队列？-6ren

python - 如何在 Python 中实现优先级队列？

转载作者：IT老高更新时间：2023-10-28 20:24:16

很抱歉提出这么愚蠢的问题，但 Python 文档令人困惑......

链接 1:队列实现 http://docs.python.org/library/queue.html

它说 Queue 有一个优先级队列的类。但我找不到如何实现它。

class Queue.PriorityQueue(maxsize=0)

链接 2:堆实现 http://docs.python.org/library/heapq.html

这里他们说我们可以使用 heapq 间接实现优先级队列

pq = []                         # list of entries arranged in a heap
entry_finder = {}               # mapping of tasks to entries
REMOVED = '<removed-task>'      # placeholder for a removed task
counter = itertools.count()     # unique sequence count

def add_task(task, priority=0):
    'Add a new task or update the priority of an existing task'
    if task in entry_finder:
        remove_task(task)
    count = next(counter)
    entry = [priority, count, task]
    entry_finder[task] = entry
    heappush(pq, entry)

def remove_task(task):
    'Mark an existing task as REMOVED.  Raise KeyError if not found.'
    entry = entry_finder.pop(task)
    entry[-1] = REMOVED

def pop_task():
    'Remove and return the lowest priority task. Raise KeyError if empty.'
    while pq:
        priority, count, task = heappop(pq)
        if task is not REMOVED:
            del entry_finder[task]
            return task
    raise KeyError('pop from an empty priority queue'

Python 中最有效的优先级队列实现是什么？以及如何实现？

最佳答案

在任何语言中都没有“最有效的优先级队列实现”之类的东西。

优先级队列是关于权衡的。见 http://en.wikipedia.org/wiki/Priority_queue

您应该根据您打算如何使用它来选择这两者之一:

O(log(N)) 插入时间和 O(1) (findMin+deleteMin)* 时间，或
O(1) 插入时间和O(log(N)) (findMin+deleteMin)*时间

(* sidenote: the findMin time of most queues is almost always O(1), sohere I mostly mean the deleteMin time can either be O(1) quick if theinsertion time is O(log(N)) slow, or the deleteMin time must beO(log(N)) slow if the insertion time is O(1) fast. One should note thatboth may also be unnecessarily slow like with binary-tree basedpriority queues.)

在后一种情况下，您可以选择使用斐波那契堆实现优先级队列:http://en.wikipedia.org/wiki/Heap_(data_structure)#Comparison_of_theoretic_bounds_for_variants (如您所见，heapq 基本上是一棵二叉树，必须有 O(log(N)) 用于插入和 findMin+deleteMin)

如果是处理特殊属性的数据(比如有界数据)，那么可以实现O(1)插入和O(1) findMin+deleteMin时间。您只能对某些类型的数据执行此操作，否则您可能会滥用您的优先级队列来违反 O(N log(N)) 对排序的约束。 vEB 树属于类似的类别，因为您有一个最大的集合大小(O(log(log(M)) 不是指元素的数量，而是指元素的最大数量)因此您无法规避理论上的 O(N log(N)) 通用比较排序界限。

要以任何语言实现任何队列，您只需定义 insert(value) 和 extractMin() -> value 操作。这通常只涉及底层堆的最小包装；见 http://en.wikipedia.org/wiki/Fibonacci_heap实现自己的，或使用类似堆的现成库，如配对堆(Google 搜索显示 http://svn.python.org/projects/sandbox/trunk/collections/pairing_heap.py)

如果您只关心您引用的两个中哪一个更有效(您在上面包含的 http://docs.python.org/library/heapq.html#priority-queue-implementation-notes 中基于 heapq 的代码，与 Queue.PriorityQueue )，然后:

对于 Queue.PriorityQueue 实际在做什么，似乎没有任何容易在网络上找到的讨论；您必须深入研究代码，该代码链接到帮助文档:http://hg.python.org/cpython/file/2.7/Lib/Queue.py

   224     def _put(self, item, heappush=heapq.heappush):
   225         heappush(self.queue, item)
   226 
   227     def _get(self, heappop=heapq.heappop):
   228         return heappop(self.queue)

正如我们所见，Queue.PriorityQueue 也使用 heapq 作为底层机制。因此它们同样糟糕(渐近地说)。 Queue.PriorityQueue 可能允许并行查询，所以我敢打赌它可能有一个非常轻微的常数因子更多的开销。但是因为您知道底层实现(和渐近行为)必须相同，所以最简单的方法就是在同一个大型数据集上运行它们。

(请注意，Queue.PriorityQueue 似乎没有删除条目的方法，而 heapq 有。但是这是一把双刃剑:良好的优先级队列实现可能允许您在 O(1) 或 O(log(N)) 时间内删除元素，但如果您使用您提到的 remove_task 函数，并让那些僵尸任务累积在您的队列中因为你没有从最小值中提取它们，所以你会看到你不会看到的渐近减速。当然，你不能首先使用 Queue.PriorityQueue 来做到这一点，所以这里不能做比较。)

关于python - 如何在 Python 中实现优先级队列？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/9969236/