
c++ - Why doesn't async in a for loop improve execution time?

Repost. Author: 行者123. Updated: 2023-12-01 14:49:46

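I'm trying to understand concurrency, so I tried to write a more flexible version (my_comp()) of Stroustrup's example code (comp4()) from A Tour of C++ (second edition), 15.7.3, page 205.
It gives the correct answer, but it does not use concurrency to improve execution time. My question is: why doesn't my_comp() behave as expected, and how do I fix it?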

#include <iostream>
#include <chrono>
#include <cmath>
#include <vector>
#include <numeric>
#include <future>
#include <fstream>

using namespace std;
using namespace std::chrono;

constexpr auto sz = 500'000'000;
constexpr int conc_num{ 4 };

double accum(double* beg, double* end, double init)
{
    return accumulate(beg, end, init);
}

double comp4(vector<double>& v)
//From Stroustrup, A Tour of C++ (Second edition)
//15.7.3 page 205
{
    auto v0 = &v[0];
    auto sz = v.size();

    auto f0 = async(accum, v0, v0 + sz / 4, 0.0);
    auto f1 = async(accum, v0 + sz / 4, v0 + sz / 2, 0.0);
    auto f2 = async(accum, v0 + sz / 2, v0 + sz * 3 / 4, 0.0);
    auto f3 = async(accum, v0 + sz * 3 / 4, v0 + sz, 0.0);

    return f0.get() + f1.get() + f2.get() + f3.get();
}

double my_comp(vector<double>& v, int conc = 1)
//My idea of a more flexible version of comp4
{
    if (conc < 1)
        conc = 1;
    auto v0 = &v[0];
    auto sz = v.size();

    vector<future<double>> fv(conc);
    for (int i = 0; i != conc; ++i) {
        auto f = async(accum, v0 + sz * (i / conc), v0 + sz * ((i + 1) / (conc)), 0.0);
        fv[i] = move(f);
    }
    double ret{ 0.0 };
    for (int i = 0; i != fv.size(); ++i) {
        ret += fv[i].get();
    }
    return ret;
}

int main()
{
    cout << "Calculating ..." << "\n\n";
    auto tv0 = high_resolution_clock::now();
    vector<double> vc;
    vc.reserve(sz);
    for (int i = 0; i != sz; ++i) {
        vc.push_back(sin(i)); //Arbitrary test function
    }
    auto tv1 = high_resolution_clock::now();
    auto durtv = duration_cast<milliseconds>(tv1 - tv0).count();
    cout << "vector of size " << vc.size() << ": " << durtv << " msec\n\n";

    ////////////////////////////////////////////
    auto vc_test = vc;
    auto t0 = high_resolution_clock::now();
    auto s1 = accumulate(vc_test.begin(), vc_test.end(), 0.0);
    auto t1 = high_resolution_clock::now();
    auto dur1 = duration_cast<milliseconds>(t1 - t0).count();
    ///////////////////////////////////////////
    vc_test = vc;
    auto tt0 = high_resolution_clock::now();
    auto s2 = my_comp(vc_test, conc_num); //Should be faster
    auto tt1 = high_resolution_clock::now();
    auto dur2 = duration_cast<milliseconds>(tt1 - tt0).count();
    ////////////////////////////////////////////
    vc_test = vc;
    auto ttt0 = high_resolution_clock::now();
    auto s3 = comp4(vc_test); //Really is faster
    auto ttt1 = high_resolution_clock::now();
    auto dur3 = duration_cast<milliseconds>(ttt1 - ttt0).count();
    ///////////////////////////////////////////

    cout << dur1 << " msec\n";
    cout << "Output = " << s1 << " (accumulate)" << "\n\n";
    cout << dur2 << " msec" << " Ratio: " << double(dur2) / double(dur1) << "\n";
    cout << "Output = " << s2 << " (my_comp)" << "\n\n";
    cout << dur3 << " msec" << " Ratio: " << double(dur3) / double(dur1) << "\n";
    cout << "Output = " << s3 << " (comp4)" << "\n\n";
}

Compiled with Visual C++ 2019 (ISO C++17 Standard (/std:c++17)), x64 Release build. Typical output:

424 msec
Output = 1.93496 (accumulate)

431 msec Ratio: 1.01651
Output = 1.93496 (my_comp)

117 msec Ratio: 0.275943
Output = 1.93496 (comp4)

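I'm aware of the parallel algorithms and std::reduce. My question is not how to optimize this particular calculation, but how to write concurrent code that behaves as expected.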

Best answer

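Your problem is here: (i / conc). Since 0 <= i < conc, and both i and conc are integers, this calculation is always zero. Likewise, ((i + 1) / conc) is zero on every iteration except the last, where it is one. So every task but the last sums an empty range, and the last task sums the whole vector by itself. That is why the result is correct but no faster than a plain accumulate.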
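To see the effect of the integer division concretely, here is a minimal sketch that computes the offsets the question's expressions produce for each task. The helper name buggy_bounds is hypothetical and exists only for illustration; the two expressions inside the loop are copied from my_comp().

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns the [begin, end) offsets that the question's
// expressions sz * (i / conc) and sz * ((i + 1) / conc) yield for each task.
std::vector<std::pair<std::size_t, std::size_t>> buggy_bounds(std::size_t sz, int conc)
{
    assert(conc >= 1); // conc == 0 would divide by zero below
    std::vector<std::pair<std::size_t, std::size_t>> out;
    for (int i = 0; i != conc; ++i) {
        std::size_t b = sz * (i / conc);       // i / conc == 0 for every i < conc
        std::size_t e = sz * ((i + 1) / conc); // 0 until i + 1 == conc, then 1
        out.emplace_back(b, e);
    }
    return out;
}
```

For sz = 8 and conc = 4 this yields (0, 0), (0, 0), (0, 0), (0, 8): three empty ranges and one range covering everything.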

To fix your problem, remove the parentheses so that the multiplication by sz happens before the integer division:

auto f = async(accum, v0 + sz * i / conc, v0 + sz * (i + 1) / conc, 0.0);
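For context, here is one way the whole function might look with that line applied. This is a sketch, not the answer's own code: the name my_comp_fixed, the const-qualified parameters, the assert in accum and the explicit std::launch::async policy are my assumptions. The question's code relies on the default launch policy, which leaves the choice between a new thread and deferred execution to the implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Sums [beg, end) starting from init; same role as accum() in the question.
double accum(const double* beg, const double* end, double init)
{
    assert(beg <= end); // precondition: a valid range
    return std::accumulate(beg, end, init);
}

// Hypothetical corrected variant of my_comp(): splits v into conc chunks
// and sums each chunk in its own asynchronous task.
double my_comp_fixed(const std::vector<double>& v, int conc = 1)
{
    if (conc < 1)
        conc = 1;
    const double* v0 = v.data();
    const std::size_t sz = v.size();
    const std::size_t n = static_cast<std::size_t>(conc);

    std::vector<std::future<double>> fv;
    fv.reserve(n);
    for (std::size_t i = 0; i != n; ++i) {
        // Multiply before dividing: boundaries are sz * i / n, not sz * (i / n).
        // std::launch::async is an assumption added here, not part of the answer.
        fv.push_back(std::async(std::launch::async, accum,
                                v0 + sz * i / n, v0 + sz * (i + 1) / n, 0.0));
    }
    double ret{ 0.0 };
    for (auto& f : fv)
        ret += f.get(); // blocks until that chunk's sum is ready
    return ret;
}
```

Because chunk i ends at sz * (i + 1) / n and chunk i + 1 starts at the same offset, the chunks cover the vector exactly once even when sz is not divisible by conc.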

Regarding "c++ - Why doesn't async in a for loop improve execution time?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58701516/
