
c++ - OpenMP for loops and pointers

Reposted. Author: 行者123. Updated: 2023-11-28 05:34:24

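I don't have much experience with OpenMP.

Can the following code be made faster by looping over pointers instead of indices?

Is there any other way to make the code below faster?

The code multiplies an array by a constant.

Thanks.

Code: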

#include <cstddef>
#include <vector>

int main() {
    std::size_t dim0 = 100;
    std::size_t dim1 = 200;
    std::size_t size_sq = dim0 * dim1;

    // resize() value-initializes the elements, so the array starts as all zeros.
    std::vector<float> vec;
    vec.resize(size_sq);
    float scalar = 0.9f;

    // Multiply every element by the constant, splitting iterations across threads.
    #pragma omp parallel
    {
        #pragma omp for
        for (std::size_t i = 0; i < size_sq; ++i) {
            vec[i] *= scalar;
        }
    }
}

Serial pointer loop:

float* ptr_start = vec.data();
float* ptr_end = ptr_start + dim0 * dim1;
for (float* ptr_now = ptr_start; ptr_now != ptr_end; ++ptr_now) {
    *ptr_now *= scalar;
}

Best answer

The pointer version of the loop should look like this:

size_t size_sq = vec.size();
float* ptr = vec.data();
#pragma omp parallel
{
    #pragma omp for
    for (size_t i = 0; i < size_sq; i++) {
        ptr[i] *= scalar;
    }
}

ptr is the same for all threads, so this is fine.
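To make the sharing explicit rather than relying on defaults, the same loop can be written with a default(none) clause, which makes the compiler require a sharing attribute for every variable used in the region. This is only a sketch: the function name scale_shared is hypothetical and not part of the original answer, and marking scalar as firstprivate is one reasonable choice, not the only one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original answer): scales vec in place.
void scale_shared(std::vector<float>& vec, float scalar) {
    std::size_t n = vec.size();
    float* ptr = vec.data();
    assert(ptr != nullptr || n == 0);

    // default(none): every variable used in the region must be listed.
    // ptr and n are shared: all threads read the same pointer and length.
    // scalar is firstprivate: each thread gets its own initialized copy.
    // i is declared in the loop header, so it is private automatically.
    #pragma omp parallel for default(none) shared(ptr, n) firstprivate(scalar)
    for (std::size_t i = 0; i < n; ++i) {
        ptr[i] *= scalar;  // each iteration touches a distinct element
    }
}
```

Calling scale_shared(vec, 0.9f) would replace the loop above. If the compiler is invoked without OpenMP enabled (for example without -fopenmp on GCC), the pragma is ignored and the loop simply runs serially with the same result.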

As an explanation, see Data sharing attribute clauses (Wikipedia):

shared: the data within a parallel region is shared, which means visible and accessible by all threads simultaneously. By default, all variables in the work sharing region are shared except the loop iteration counter.

private: the data within a parallel region is private to each thread, which means each thread will have a local copy and use it as a temporary variable. A private variable is not initialized and the value is not maintained for use outside the parallel region. By default, the loop iteration counters in the OpenMP loop constructs are private.

In this case, i is private and ptr is shared.
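The same rule applies if the loop variable itself is a pointer, which is what the question originally asked about. The sketch below assumes a compiler supporting OpenMP 3.0 or later, where a pointer is allowed as the loop variable; the function name scale_range is hypothetical. Note that the loop test uses < instead of the != from the serial version, because != is only accepted in a work-shared loop from OpenMP 5.0 onward. Whether this runs any faster than the indexed version is not something the original answer claims, so it would need to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post): scales [first, last) in place.
void scale_range(float* first, float* last, float scalar) {
    assert(first <= last);

    // p is the loop variable, so it is private to each thread, just like i
    // in the indexed version. first, last and scalar are shared by default.
    #pragma omp parallel for
    for (float* p = first; p < last; ++p) {
        *p *= scalar;
    }
}
```

With the vector from the question this would be called as scale_range(vec.data(), vec.data() + vec.size(), scalar).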

For more on c++ - OpenMP for loops and pointers, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/38664619/
