
c - Equivalent SIMD instructions for multiplying specific array elements

Reposted. Author: 行者123. Last updated: 2023-12-04 12:33:25

I just learned how to compute the dot product of two arrays with SIMD. The scalar code below:

    int A[8] = {1,2,3,4,5,1,2,3};
    int B[8] = {2,3,4,5,6,2,3,4};

    float result = 0;

    for (int i = 0; i < 8; i++) {
        result += A[i] * B[i];
    }

is equivalent to the following (in SIMD):

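    #include <pmmintrin.h>  /* SSE3, needed for _mm_hadd_ps */

    int A[8] = {1,2,3,4,5,1,2,3};
    int B[8] = {2,3,4,5,6,2,3,4};

    float result = 0;

    __m128 r1 = _mm_setzero_ps();
    __m128 r2 = _mm_setzero_ps();
    __m128 r3 = _mm_setzero_ps();  /* running total, kept in the lowest lane */

    for (int i = 0; i < 8; i += 4) {
        /* Convert four ints from each array to floats so they can be loaded as __m128. */
        float C[4] = {(float)A[i], (float)A[i+1], (float)A[i+2], (float)A[i+3]};
        float D[4] = {(float)B[i], (float)B[i+1], (float)B[i+2], (float)B[i+3]};
        __m128 a = _mm_loadu_ps(C);
        __m128 b = _mm_loadu_ps(D);

        r1 = _mm_mul_ps(a, b);                         /* four products */
        r2 = _mm_hadd_ps(r1, r1);                      /* pairwise sums */
        r3 = _mm_add_ss(_mm_hadd_ps(r2, r2), r3);      /* sum of all four, plus the running total */
        _mm_store_ss(&result, r3);
    }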
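The loop above can also be written without the temporary float arrays and without the per-iteration horizontal adds. The following is only a sketch under a few assumptions of mine, not part of the original question: the function name `dot_sse2` is made up, `n` is assumed to be a multiple of 4, and SSE2 is assumed to be available (it always is on x86-64). It converts the ints in registers with `_mm_cvtepi32_ps`, keeps four partial sums in one vector, and reduces them once after the loop.

```c
#include <emmintrin.h>  /* SSE2; also pulls in the SSE float intrinsics */

/* Dot product of two int arrays, accumulated in float.
   Assumption: n is a multiple of 4. */
float dot_sse2(const int *a, const int *b, int n)
{
    __m128 acc = _mm_setzero_ps();  /* four running partial sums */
    for (int i = 0; i < n; i += 4) {
        /* Unaligned load of four ints, then int -> float conversion. */
        __m128 va = _mm_cvtepi32_ps(_mm_loadu_si128((const __m128i *)(a + i)));
        __m128 vb = _mm_cvtepi32_ps(_mm_loadu_si128((const __m128i *)(b + i)));
        acc = _mm_add_ps(acc, _mm_mul_ps(va, vb));
    }
    /* Single horizontal reduction after the loop. */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

Called as `dot_sse2(A, B, 8)` with the arrays from the question, this should give the same value as the scalar loop.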

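Now I am curious how to get equivalent SIMD code when I want to multiply array elements that are not contiguous. For example, what is the SIMD equivalent of the following?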

    int A[8] = {1,2,3,4,5,1,2,3};
    int B[8] = {2,3,4,5,6,2,3,4};

    float result = 0;
    for (int i = 0; i < 8; i++) {
        for (int j = 0; j < 8; j++) {
            result += A[foo(i)] * B[foo(j)];
        }
    }

foo is just some function that takes an int argument and returns an int computed from it.

Best answer

If I had to do this task, I would do it as follows:

    int A[8] = {1,2,3,4,5,1,2,3};
    int B[8] = {2,3,4,5,6,2,3,4};

    /* Gather the permuted elements into contiguous float arrays first. */
    float PA[8], PB[8];
    for (int i = 0; i < 8; i++)
    {
        PA[i] = A[foo(i)];
        PB[i] = B[foo(i)];
    }

    __m128 sums = _mm_set1_ps(0);
    for (int i = 0; i < 8; i++)
    {
        __m128 a = _mm_set1_ps(PA[i]);  /* broadcast PA[i] to all four lanes */
        for (int j = 0; j < 8; j += 4)
        {
            __m128 b = _mm_loadu_ps(PB + j);
            sums = _mm_add_ps(sums, _mm_mul_ps(a, b));
        }
    }
    /* Add up the four partial sums once, after the loops. */
    float results[4];
    _mm_storeu_ps(results, sums);
    float result = results[0] + results[1] + results[2] + results[3];
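To try the answer's approach end to end, here is a minimal sketch that wraps it in a function next to the scalar double loop from the question. Several things are assumptions of mine for illustration only: the mapping `foo(i) = (i * i) % 8` is a hypothetical stand-in (the question does not say what foo computes), and the function names `pair_sum_scalar` and `pair_sum_sse` are made up.

```c
#include <emmintrin.h>  /* SSE2 header; also pulls in the SSE float intrinsics */

/* Hypothetical index mapping, chosen only so the example can run. */
static int foo(int i) { return (i * i) % 8; }

/* Scalar reference: the double loop from the question. */
float pair_sum_scalar(const int *A, const int *B)
{
    float result = 0;
    for (int i = 0; i < 8; i++)
        for (int j = 0; j < 8; j++)
            result += (float)(A[foo(i)] * B[foo(j)]);
    return result;
}

/* SIMD version following the answer: gather first, then multiply-accumulate. */
float pair_sum_sse(const int *A, const int *B)
{
    float PA[8], PB[8];
    for (int i = 0; i < 8; i++) {       /* scalar gather into contiguous buffers */
        PA[i] = (float)A[foo(i)];
        PB[i] = (float)B[foo(i)];
    }

    __m128 sums = _mm_setzero_ps();
    for (int i = 0; i < 8; i++) {
        __m128 a = _mm_set1_ps(PA[i]);  /* broadcast PA[i] to all four lanes */
        for (int j = 0; j < 8; j += 4)
            sums = _mm_add_ps(sums, _mm_mul_ps(a, _mm_loadu_ps(PB + j)));
    }

    float lanes[4];
    _mm_storeu_ps(lanes, sums);         /* reduce the four partial sums once */
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

With the arrays from the question, both functions should return the same value. As a cross-check, the double sum also equals the sum of the gathered A values times the sum of the gathered B values, since every A term is multiplied by every B term.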

Regarding "c - Equivalent SIMD instructions for multiplying specific array elements", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/32083301/
