
c++ - Fast implementation of covariance of two 8-bit arrays

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 01:27:38

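I need to compare a large number of similar small images (up to 200x200), so I am trying to implement the SSIM algorithm (structural similarity, see https://en.wikipedia.org/wiki/Structural_similarity). SSIM requires computing the covariance of two 8-bit grayscale images. A straightforward implementation looks like this: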

float SigmaXY(const uint8_t * x, const uint8_t * y, size_t size, float averageX, float averageY)
{
    float sum = 0;
    for(size_t i = 0; i < size; ++i)
        sum += (x[i] - averageX) * (y[i] - averageY);
    return sum / size;
}

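However, its performance is poor, so I would like to speed it up with SIMD or CUDA (I have heard this is possible). Unfortunately, I have no experience with either. What would such an implementation look like, and where should I start?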

Best answer

I have another nice solution!

First, some math:

averageX = Sum(x[i])/size;
averageY = Sum(y[i])/size;

Therefore:

Sum((x[i] - averageX)*(y[i] - averageY))/size = 

Sum(x[i]*y[i])/size - Sum(x[i]*averageY)/size -
Sum(averageX*y[i])/size + Sum(averageX*averageY)/size =

Sum(x[i]*y[i])/size - averageY*Sum(x[i])/size -
averageX*Sum(y[i])/size + averageX*averageY*Sum(1)/size =

Sum(x[i]*y[i])/size - averageY*averageX -
averageX*averageY + averageX*averageY =

Sum(x[i]*y[i])/size - averageY*averageX;
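
As a quick sanity check of the identity above, here is a minimal sketch that evaluates both sides in double precision. The names Mean, CovarianceDirect and CovarianceShortcut are hypothetical helpers introduced only for this illustration; they are not part of the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: arithmetic mean of an 8-bit array.
double Mean(const uint8_t* v, size_t size)
{
    assert(size > 0);
    uint64_t sum = 0;
    for (size_t i = 0; i < size; ++i)
        sum += v[i];
    return double(sum) / double(size);
}

// Left-hand side: Sum((x[i] - averageX)*(y[i] - averageY))/size.
double CovarianceDirect(const uint8_t* x, const uint8_t* y, size_t size)
{
    const double averageX = Mean(x, size), averageY = Mean(y, size);
    double sum = 0;
    for (size_t i = 0; i < size; ++i)
        sum += (x[i] - averageX) * (y[i] - averageY);
    return sum / double(size);
}

// Right-hand side: Sum(x[i]*y[i])/size - averageY*averageX.
double CovarianceShortcut(const uint8_t* x, const uint8_t* y, size_t size)
{
    uint64_t sum = 0; // integer accumulation, no rounding inside the loop
    for (size_t i = 0; i < size; ++i)
        sum += uint32_t(x[i]) * y[i];
    return double(sum) / double(size) - Mean(y, size) * Mean(x, size);
}
```

For x = {1, 2, 3, 4} and y = {2, 4, 6, 8}, both functions return 2.5: the direct sum is 10/4, and the shortcut is 60/4 - 5*2.5. The shortcut form keeps the inner loop in pure integer arithmetic, which is what makes the later SIMD step straightforward.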

This allows us to rewrite the algorithm:

float SigmaXY(const uint8_t * x, const uint8_t * y, size_t size, float averageX, float averageY)
{
    uint32_t sum = 0; // For images larger than 256x256, use uint64_t to avoid overflow.
    for(size_t i = 0; i < size; ++i)
        sum += x[i]*y[i];
    // Divide in floating point: sum / size on integers would truncate.
    return float(double(sum) / size) - averageY*averageX;
}

Only after that can we use SIMD (I use SSE2):

#include <emmintrin.h>

// Returns four 32-bit partial sums of x[i]*y[i] over 16 bytes.
inline __m128i SigmaXY(__m128i x, __m128i y)
{
    // Zero-extend bytes to 16 bits, then multiply and add adjacent pairs.
    __m128i lo = _mm_madd_epi16(_mm_unpacklo_epi8(x, _mm_setzero_si128()), _mm_unpacklo_epi8(y, _mm_setzero_si128()));
    __m128i hi = _mm_madd_epi16(_mm_unpackhi_epi8(x, _mm_setzero_si128()), _mm_unpackhi_epi8(y, _mm_setzero_si128()));
    return _mm_add_epi32(lo, hi);
}

float SigmaXY(const uint8_t * x, const uint8_t * y, size_t size, float averageX, float averageY)
{
    uint32_t sum = 0;
    size_t i = 0, alignedSize = size/16*16;
    if(size >= 16)
    {
        __m128i sums = _mm_setzero_si128();
        for(; i < alignedSize; i += 16)
        {
            __m128i _x = _mm_loadu_si128((const __m128i*)(x + i));
            __m128i _y = _mm_loadu_si128((const __m128i*)(y + i));
            sums = _mm_add_epi32(sums, SigmaXY(_x, _y));
        }
        // Horizontal sum of the four 32-bit lanes.
        uint32_t _sums[4];
        _mm_storeu_si128((__m128i*)_sums, sums);
        sum = _sums[0] + _sums[1] + _sums[2] + _sums[3];
    }
    // Scalar tail for the remaining size % 16 elements.
    for(; i < size; ++i)
        sum += x[i]*y[i];
    // Divide in floating point: sum / size on integers would truncate.
    return float(double(sum) / size) - averageY*averageX;
}
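
The earlier comment notes that images larger than 256x256 need a 64-bit sum. One way this might be done with SSE2 is sketched below: the 32-bit lanes are flushed into a uint64_t total every few thousand steps, before they can overflow. The names SumXY64 and SigmaXY64 and the block length of 4096 steps are assumptions made for this illustration, not part of the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <emmintrin.h>

// Hypothetical variant: returns Sum(x[i]*y[i]) as a 64-bit value.
uint64_t SumXY64(const uint8_t* x, const uint8_t* y, size_t size)
{
    // Each 32-bit lane grows by at most 4*255*255 = 260100 per 16-byte step,
    // so 4096 steps (about 1.07e9) stay below 2^31.
    const size_t blockSteps = 4096;
    const __m128i zero = _mm_setzero_si128();
    const size_t alignedSize = size / 16 * 16;
    uint64_t total = 0;
    size_t i = 0;
    while (i < alignedSize)
    {
        __m128i sums = zero;
        for (size_t step = 0; step < blockSteps && i < alignedSize; ++step, i += 16)
        {
            const __m128i vx = _mm_loadu_si128(reinterpret_cast<const __m128i*>(x + i));
            const __m128i vy = _mm_loadu_si128(reinterpret_cast<const __m128i*>(y + i));
            const __m128i lo = _mm_madd_epi16(_mm_unpacklo_epi8(vx, zero), _mm_unpacklo_epi8(vy, zero));
            const __m128i hi = _mm_madd_epi16(_mm_unpackhi_epi8(vx, zero), _mm_unpackhi_epi8(vy, zero));
            sums = _mm_add_epi32(sums, _mm_add_epi32(lo, hi));
        }
        // Flush the four 32-bit lanes into the 64-bit total.
        uint32_t lanes[4];
        _mm_storeu_si128(reinterpret_cast<__m128i*>(lanes), sums);
        total += uint64_t(lanes[0]) + lanes[1] + lanes[2] + lanes[3];
    }
    // Scalar tail for the remaining size % 16 elements.
    for (; i < size; ++i)
        total += uint32_t(x[i]) * y[i];
    return total;
}

// Same formula as SigmaXY, with the division done in double precision.
float SigmaXY64(const uint8_t* x, const uint8_t* y, size_t size, float averageX, float averageY)
{
    assert(size > 0);
    return float(double(SumXY64(x, y, size)) / double(size)) - averageY * averageX;
}
```

For the 200x200 images in the question the plain 32-bit version is already sufficient, so this variant only matters for larger inputs.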

Regarding "c++ - Fast implementation of covariance of two 8-bit arrays", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/35216851/
