
c++ - Efficient union of n sorted arrays in C++ (set vs. vector)?

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 04:14:56

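I need to implement an efficient algorithm that finds the sorted union of several sorted arrays. Since my program performs many of these operations, I simulated it in C++. My first approach (method 1) simply creates an empty vector, appends every element of the other vectors to it, and then uses std::sort and std::unique to get the desired sorted union of all elements. However, I thought it might be more efficient to dump all the vector elements into a set (method 2), because the set would make them unique and sorted in one go. To my surprise, method 1 was 5 times faster than method 2! Am I doing something wrong here? Shouldn't method 2 be faster because it does less computation? Thanks in advance.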

//// Method 1, with a vector:

#include <algorithm>
#include <ctime>
#include <iostream>
#include <vector>

std::vector<long> arr1{5,12,32,33,34,50};
std::vector<long> arr2{1,2,3,4,5};
std::vector<long> arr3{1,8,9,11};

std::vector<long> arr;

int main(int argc, const char * argv[]) {

    double sec;
    clock_t t;
    t=clock();
    for(long j=0; j<1000000; j++){ // repeating for benchmark
        arr.clear();
        // Append every input, then sort and drop adjacent duplicates.
        for(std::size_t i=0; i<arr1.size(); i++){
            arr.push_back(arr1[i]);
        }
        for(std::size_t i=0; i<arr2.size(); i++){
            arr.push_back(arr2[i]);
        }
        for(std::size_t i=0; i<arr3.size(); i++){
            arr.push_back(arr3[i]);
        }
        std::sort(arr.begin(), arr.end());
        auto last = std::unique(arr.begin(), arr.end());
        arr.erase(last, arr.end());
    }
    t=clock() - t;
    sec = (double)t/CLOCKS_PER_SEC;
    std::cout<<"seconds = "<< sec <<" clicks = " << t << std::endl;

    return 0;
}

//// Method 2, with a set:

#include <ctime>
#include <iostream>
#include <set>
#include <vector>

std::vector<long> arr1{5,12,32,33,34,50};
std::vector<long> arr2{1,2,3,4,5};
std::vector<long> arr3{1,8,9,11};

std::set<long> arr;

int main(int argc, const char * argv[]) {

    double sec;
    clock_t t;
    t=clock();
    for(long j=0; j<1000000; j++){ // repeating for benchmark
        arr.clear();
        // The set keeps its elements sorted and unique on insertion.
        arr.insert(arr1.begin(), arr1.end());
        arr.insert(arr2.begin(), arr2.end());
        arr.insert(arr3.begin(), arr3.end());
    }
    t=clock() - t;
    sec = (double)t/CLOCKS_PER_SEC;
    std::cout<<"seconds = "<< sec <<" clicks = " << t << std::endl;

    return 0;
}

Best answer

Here is how to do it with two vectors. You can easily generalize the procedure to N vectors.

vector<int> v1{ 4, 8, 12, 16 };
vector<int> v2{ 2, 6, 10, 14 };

vector<int> merged;
merged.reserve(v1.size() + v2.size());

// An iterator on each vector
auto it1 = v1.begin();
auto it2 = v2.begin();

while (it1 != v1.end() && it2 != v2.end())
{
    // Find the iterator that points to the smallest number.
    // Grab the value.
    // Advance the iterator, and repeat.

    if (*it1 < *it2)
    {
        if (merged.empty() || merged.back() < *it1)
            merged.push_back(*it1);
        ++it1;
    }
    else
    {
        if (merged.empty() || merged.back() < *it2)
            merged.push_back(*it2);
        ++it2;
    }
}

// Drain whichever vector still has elements. The same duplicate check is
// needed here: the first leftover value may equal merged.back().
while (it1 != v1.end())
{
    if (merged.empty() || merged.back() < *it1)
        merged.push_back(*it1);
    ++it1;
}

while (it2 != v2.end())
{
    if (merged.empty() || merged.back() < *it2)
        merged.push_back(*it2);
    ++it2;
}

// if you print out the values in 'merged', it gives the expected result
// [2, 4, 6, 8, 10, 12, 14, 16]
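
For the two-vector case, the standard library's std::set_union performs the same kind of linear merge. The following is only a minimal sketch, not part of the original answer: the function name unionOfTwo is hypothetical, and it assumes each input is sorted and contains no duplicates of its own (std::set_union keeps repeated values that occur several times within a single input).

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper for illustration: union of two sorted vectors.
// Assumption: each input is sorted and has no internal duplicates.
std::vector<int> unionOfTwo(const std::vector<int>& a, const std::vector<int>& b)
{
    // std::set_union requires sorted input ranges.
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));

    std::vector<int> out;
    out.reserve(a.size() + b.size());   // upper bound on the result size
    std::set_union(a.begin(), a.end(),
                   b.begin(), b.end(),
                   std::back_inserter(out));
    return out;
}
```

Calling unionOfTwo(v1, v2) with the two vectors above should give the same sequence as the hand-written loop. For N vectors, the call can be repeated pairwise, although that copies intermediate results.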

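...The two-way merge generalizes to N vectors as follows. Note that a helper struct holding the "current" iterator and the end iterator would be cleaner, but the idea stays the same.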

vector<int> v1{ 4, 8, 12, 16 };
vector<int> v2{ 2, 6, 10, 14 };
vector<int> v3{ 3, 7, 11, 15 };
vector<int> v4{ 0, 21};

vector<int> merged;
// reserve space accordingly...

using vectorIt = vector<int>::const_iterator;

// Note: every input vector must be non-empty, because the loop below
// dereferences each stored iterator.
vector<vectorIt> fwdIterators;
fwdIterators.push_back(v1.begin());
fwdIterators.push_back(v2.begin());
fwdIterators.push_back(v3.begin());
fwdIterators.push_back(v4.begin());
vector<vectorIt> endIterators;
endIterators.push_back(v1.end());
endIterators.push_back(v2.end());
endIterators.push_back(v3.end());
endIterators.push_back(v4.end());

while (!fwdIterators.empty())
{
    // Find out which iterator carries the smallest value
    size_t index = 0;
    for (size_t i = 1; i < fwdIterators.size(); ++i)
    {
        if (*fwdIterators[i] < *fwdIterators[index])
            index = i;
    }

    if (merged.empty() || merged.back() < *fwdIterators[index])
        merged.push_back(*fwdIterators[index]);

    // Advance, and drop the pair once that vector is exhausted
    ++fwdIterators[index];
    if (fwdIterators[index] == endIterators[index])
    {
        fwdIterators.erase(fwdIterators.begin() + index);
        endIterators.erase(endIterators.begin() + index);
    }
}

// again, merged contains the expected result
// [0, 2, 3, 4, 6, 7, 8, 10, 11, 12, 14, 15, 16, 21]

...And as several people pointed out, using a heap would be faster.

#include <queue>
#include <vector>

using std::vector;

// Helper struct to make it more convenient
// Note: an Entry must be built from a non-empty vector, since fwdIt is dereferenced.
struct Entry
{
    vector<int>::const_iterator fwdIt;
    vector<int>::const_iterator endIt;

    Entry(vector<int> const& v) : fwdIt(v.begin()), endIt(v.end()) {}
    bool IsAlive() const { return fwdIt != endIt; }
    // Inverted on purpose: std::priority_queue is a max-heap, so this
    // puts the entry with the smallest current value on top.
    bool operator< (Entry const& rhs) const { return *fwdIt > *rhs.fwdIt; }
};


int main()
{
    vector<int> v1{ 4, 8, 12, 16 };
    vector<int> v2{ 2, 6, 10, 14 };
    vector<int> v3{ 3, 7, 11, 15 };
    vector<int> v4{ 0, 21};

    vector<int> merged;
    merged.reserve(v1.size() + v2.size() + v3.size() + v4.size());

    std::priority_queue<Entry> queue;
    queue.push(Entry(v1));
    queue.push(Entry(v2));
    queue.push(Entry(v3));
    queue.push(Entry(v4));

    while (!queue.empty())
    {
        Entry tmp = queue.top();
        queue.pop();

        if (merged.empty() || merged.back() < *tmp.fwdIt)
            merged.push_back(*tmp.fwdIt);

        ++tmp.fwdIt;

        // Re-insert the entry only while its vector has elements left
        if (tmp.IsAlive())
            queue.push(tmp);
    }

    return 0;
}

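This does seem to copy "Entry" objects a lot, though; it may be better to give std::priority_queue pointers to the entries, together with an appropriate comparison function.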
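
The pointer-based variant hinted at above might look like the following sketch. It is one possible reading of that suggestion, not the answer author's code: the names Cursor, CursorGreater and unionSorted are hypothetical, and the inputs are assumed to be sorted. The cursors live in one vector that is reserved up front, so the heap only moves pointers around.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical helper types for illustration only.
struct Cursor
{
    std::vector<int>::const_iterator cur;
    std::vector<int>::const_iterator end;
};

// std::priority_queue is a max-heap, so "greater" puts the cursor
// with the smallest current value on top.
struct CursorGreater
{
    bool operator()(const Cursor* a, const Cursor* b) const
    {
        return *a->cur > *b->cur;
    }
};

std::vector<int> unionSorted(const std::vector<std::vector<int>>& inputs)
{
    std::vector<Cursor> cursors;      // owns the cursors
    cursors.reserve(inputs.size());   // no reallocation, so pointers stay valid
    std::size_t total = 0;
    for (const auto& v : inputs)
    {
        assert(std::is_sorted(v.begin(), v.end()));  // assumption: sorted input
        total += v.size();
        if (!v.empty())               // empty inputs must never be dereferenced
            cursors.push_back(Cursor{ v.begin(), v.end() });
    }

    std::priority_queue<Cursor*, std::vector<Cursor*>, CursorGreater> heap;
    for (auto& c : cursors)
        heap.push(&c);

    std::vector<int> merged;
    merged.reserve(total);
    while (!heap.empty())
    {
        Cursor* c = heap.top();
        heap.pop();
        if (merged.empty() || merged.back() < *c->cur)
            merged.push_back(*c->cur);   // skip values already emitted
        if (++c->cur != c->end)
            heap.push(c);                // re-insert only while elements remain
    }
    return merged;
}
```

Passing the four example vectors as unionSorted({v1, v2, v3, v4}) should reproduce the merged sequence shown earlier. Whether this beats the copying version in practice depends on the compiler and the data, so it is worth benchmarking rather than assuming.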

Regarding "c++ - Efficient union of n sorted arrays in C++ (set vs. vector)?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/52634460/
