
python - Worst-case runtime complexity analysis of 'Top K most frequent elements'

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 06:39:40

Problem statement:

Given a non-empty list of words, return the k most frequent elements.

Your answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.

Example 1:
Input: ["i", "love", "stackoverflow", "i", "love", "coding"], k = 2
Output: ["i", "love"]
Explanation: "i" and "love" are the two most frequent words. Note that "i" comes before "love" due to a lower alphabetical order.

My Python solution using frequency buckets:

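import collections

def topKFrequent(words, k):
    wordCount = collections.Counter(words)
    # freq[c] holds every word that occurs exactly c times
    freq = [[] for i in range(len(words) + 1)]
    res = []
    for word, count in wordCount.items():
        freq[count].append(word)
    # Walk the buckets from the highest frequency down
    for i in range(len(freq) - 1, 0, -1):
        if k == 0:
            break
        elif k >= len(freq[i]):
            # The whole bucket fits; add it in alphabetical order
            res.extend(sorted(freq[i]))
            k -= len(freq[i])
        else:
            # Only part of this bucket is needed; take the first k alphabetically
            res.extend(sorted(freq[i])[:k])
            break
    return res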

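Now, my argument is that the above runs in O(n log n). Ignoring the Counter initialization and the freq initialization, which are both O(n), in the worst case the final loop has a single bucket containing all the words (each word appears exactly once), so we end up sorting only that bucket, which is n log(n).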

Is the intuitive analysis above correct?
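The worst case described in the question can be made concrete with a small sketch. The helper name `bucket_sizes` and the generated word list are hypothetical and not part of the original solution; the snippet only shows how many distinct words land in each frequency bucket.

```python
from collections import Counter


def bucket_sizes(words):
    """Map each frequency to the number of distinct words that have it."""
    # Inner Counter: word -> frequency. Outer Counter: frequency -> bucket size.
    return dict(Counter(Counter(words).values()))


# Hypothetical worst-case input: every word is distinct, so all n words
# fall into the bucket for frequency 1 and that single bucket gets sorted.
distinct_words = [f"w{i}" for i in range(8)]
print(bucket_sizes(distinct_words))  # {1: 8}
```

With all n words in one bucket, the single `sorted` call on that bucket dominates the loop, which is the n log n term the question refers to.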

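Best answer

Yes, your analysis is correct. If you have n words, your initialization steps run in O(n). Your for loop then performs an O(m log m) sort on each of the j partitions, where m is the size of that partition. Here is a proof: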

Let L be a list of n elements. Partition L into j different partitions containing n_1, ..., n_j elements respectively. Clearly n_1 + ... + n_j = n and 1 <= j <= n.

We can ignore the iterations that do not process any items: there are at most n of them and each takes constant time, so together they cost O(n). The for loop therefore does real work on j iterations, and in iteration i it does O(n_i log n_i) work, so that iteration is bounded by C_i n_i log n_i for a suitable constant C_i. The total work is then at most C_1 n_1 log n_1 + ... + C_j n_j log n_j. Let K be the largest of the C_i. Since n_i <= n, we have log n_i <= log n, so the total is bounded above by K n_1 log n + ... + K n_j log n = K (n_1 + ... + n_j) log n = K n log n. Therefore the loop runs in O(n log n) time.
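For readers who prefer the bound in compact notation, the same chain of inequalities can be written as follows. This only restates the argument with the symbols already defined (n_i, C_i, K, j) and adds nothing new.

```latex
\[
\sum_{i=1}^{j} C_i \, n_i \log n_i
\;\le\; K \sum_{i=1}^{j} n_i \log n_i
\;\le\; K \log n \sum_{i=1}^{j} n_i
\;=\; K \, n \log n,
\qquad K = \max_{1 \le i \le j} C_i .
\]
```

The second inequality uses n_i <= n, and it is tight when j = 1 and n_1 = n, which is the single-bucket worst case from the question.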

I'll note that there is an O(n log k) algorithm that uses a min-heap in which you keep at most k elements...
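The answer does not spell out the min-heap approach, so the following is only a sketch of one way it could look, not the answerer's own code. The names `_Entry` and `top_k_frequent_heap` are hypothetical. The wrapper class exists so that, among words with equal counts, the alphabetically later word counts as "smaller" and is evicted first, which preserves the tie-breaking rule from the problem statement.

```python
import heapq
from collections import Counter


class _Entry:
    """Heap entry ordered so the least desirable candidate is the heap minimum."""

    def __init__(self, count, word):
        self.count = count
        self.word = word

    def __lt__(self, other):
        # A lower count is "smaller". On a tie, the alphabetically later word
        # is "smaller", so it is the one evicted from the heap first.
        if self.count != other.count:
            return self.count < other.count
        return self.word > other.word


def top_k_frequent_heap(words, k):
    """Return the k most frequent words, keeping at most k entries in the heap."""
    heap = []
    for word, count in Counter(words).items():
        heapq.heappush(heap, _Entry(count, word))
        if len(heap) > k:
            heapq.heappop(heap)  # drop the current worst candidate
    # Popping yields worst-to-best, so reverse to get best-to-worst.
    result = []
    while heap:
        result.append(heapq.heappop(heap).word)
    return result[::-1]


print(top_k_frequent_heap(["i", "love", "stackoverflow", "i", "love", "coding"], 2))
# ['i', 'love']
```

Each push or pop touches a heap of at most k + 1 entries, so the loop costs O(log k) per distinct word after the O(n) counting pass, treating word comparisons as constant time.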

Regarding python - Worst-case runtime complexity analysis of 'Top K most frequent elements', a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53000122/
