
c++ - A faster form of Hamming distance in C++ (possibly using the standard library)?

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 01:29:21

I have two int vectors, a[100] and b[100].
The straightforward way to compute their Hamming distance is:

std::vector<int> a(100);
std::vector<int> b(100);

double dist = 0;
for (int i = 0; i < 100; i++) {
    if (a[i] != b[i])
        dist++;
}
dist /= a.size();

I would like to ask: is there a faster way to do this computation in C++, or a way to do the same work using the STL?

Best Answer

You asked for a faster way. This is an embarrassingly parallel problem, so with C++ you can exploit that in two ways: thread-level parallelism, and vectorization through compiler optimization.

// The following flags allow cpu-specific vectorization optimizations on *my cpu*:
// clang++ -march=corei7-avx hd.cpp -o hd -Ofast -pthread -std=c++1y
// g++ -march=corei7-avx hd.cpp -o hd -Ofast -pthread -std=c++1y

#include <vector>
#include <thread>
#include <future>
#include <numeric>

template<class T, class I1, class I2>
T hamming_distance(size_t size, I1 b1, I2 b2) {
    return std::inner_product(b1, b1 + size, b2, T{},
                              std::plus<T>(), std::not_equal_to<T>());
}

template<class T, class I1, class I2>
T parallel_hamming_distance(size_t threads, size_t size, I1 b1, I2 b2) {
    if (size < 1000)
        return hamming_distance<T, I1, I2>(size, b1, b2);

    if (threads > size)
        threads = size;

    const size_t whole_part = size / threads;
    const size_t remainder = size - threads * whole_part;

    std::vector<std::future<T>> bag;
    bag.reserve(threads + (remainder > 0 ? 1 : 0));

    for (size_t i = 0; i < threads; ++i)
        bag.emplace_back(std::async(std::launch::async,
                                    hamming_distance<T, I1, I2>,
                                    whole_part,
                                    b1 + i * whole_part,
                                    b2 + i * whole_part));
    if (remainder > 0)
        bag.emplace_back(std::async(std::launch::async,
                                    hamming_distance<T, I1, I2>,
                                    remainder,
                                    b1 + threads * whole_part,
                                    b2 + threads * whole_part));

    T hamming_distance = 0;
    for (auto &f : bag) hamming_distance += f.get();
    return hamming_distance;
}

#include <ratio>
#include <random>
#include <chrono>
#include <iostream>
#include <cinttypes>

int main() {
    using namespace std;
    using namespace chrono;

    random_device rd;
    mt19937 gen(rd());
    uniform_int_distribution<> random_0_9(0, 9);

    const auto size = 100 * mega::num;
    vector<int32_t> v1(size);
    vector<int32_t> v2(size);

    for (auto &x : v1) x = random_0_9(gen);
    for (auto &x : v2) x = random_0_9(gen);

    cout << "naive hamming distance: ";
    const auto naive_start = high_resolution_clock::now();
    cout << hamming_distance<int32_t>(v1.size(), begin(v1), begin(v2)) << endl;
    const auto naive_elapsed = high_resolution_clock::now() - naive_start;

    const auto n = thread::hardware_concurrency();

    cout << "parallel hamming distance: ";
    const auto parallel_start = high_resolution_clock::now();
    cout << parallel_hamming_distance<int32_t>(
                n,
                v1.size(),
                begin(v1),
                begin(v2))
         << endl;
    const auto parallel_elapsed = high_resolution_clock::now() - parallel_start;

    auto count_microseconds =
        [](const high_resolution_clock::duration &elapsed) {
            return duration_cast<microseconds>(elapsed).count();
        };

    cout << "naive delay: " << count_microseconds(naive_elapsed) << endl;
    cout << "parallel delay: " << count_microseconds(parallel_elapsed) << endl;
}

Note that, unlike the question's snippet, I do not divide by the vector size; the functions return the raw mismatch count.
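If the question's normalized metric is wanted, the raw count can simply be divided by the element count afterwards; a minimal sketch (the function name is hypothetical):

```cpp
#include <cstddef>

// Turn a raw mismatch count into the normalized distance from the
// question: the fraction of positions that differ.
double normalized_distance(std::size_t mismatches, std::size_t size) {
    return size == 0 ? 0.0
                     : static_cast<double>(mismatches) / static_cast<double>(size);
}
```

For example, `normalized_distance(hamming_distance<int32_t>(v1.size(), begin(v1), begin(v2)), v1.size())` reproduces the `dist /= a.size();` step of the original loop.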

Results on my machine (which show that a machine with only 2 physical cores doesn't gain much...):

$ clang++ -march=corei7-avx hd.cpp -o hd -Ofast -pthread -std=c++1y -stdlib=libc++ -lcxxrt -ldl
$ ./hd
naive hamming distance: 89995190
parallel hamming distance: 89995190
naive delay: 52758
parallel delay: 47227

$ clang++ hd.cpp -o hd -O3 -pthread -std=c++1y -stdlib=libc++ -lcxxrt -ldl
$ ./hd
naive hamming distance: 90001042
parallel hamming distance: 90001042
naive delay: 53851
parallel delay: 46887

$ g++ -march=corei7-avx hd.cpp -o hd -Ofast -pthread -std=c++1y -Wl,--no-as-needed
$ ./hd
naive hamming distance: 90001825
parallel hamming distance: 90001825
naive delay: 55229
parallel delay: 49355

$ g++ hd.cpp -o hd -O3 -pthread -std=c++1y -Wl,--no-as-needed
$ ./hd
naive hamming distance: 89996171
parallel hamming distance: 89996171
naive delay: 54189
parallel delay: 44928

I also found that auto-vectorization made no difference here; the generated assembly would need to be inspected...
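As a side note beyond the original answer: since C++17 this exact reduction has a dedicated standard algorithm, std::transform_reduce, which pairs the two ranges and sums the comparison results; a sketch (hamming_count is a hypothetical name):

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Count mismatching positions: not_equal_to yields true/false per pair,
// and plus accumulates those booleans into a total.
std::size_t hamming_count(const std::vector<int>& a, const std::vector<int>& b) {
    assert(a.size() == b.size());
    return std::transform_reduce(a.begin(), a.end(), b.begin(),
                                 std::size_t{0},
                                 std::plus<>{},
                                 std::not_equal_to<>{});
}
```

On standard libraries with parallel-algorithm support, passing std::execution::par_unseq (from <execution>) as an extra first argument hands the chunking to the library instead of hand-written std::async code; on GCC's libstdc++ this requires linking against TBB.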

For examples of vectorization and compiler options, see this blog post of mine.
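One further option, assuming the data is genuinely binary (0/1 per position) rather than small integers as in the benchmark above: packing it into bit-vectors lets a single word-sized XOR plus popcount compare many positions at once, which is the classic fast form of Hamming distance. A hedged sketch using std::bitset:

```cpp
#include <bitset>
#include <cstddef>

// Hamming distance between two fixed-size bit-vectors: XOR marks the
// differing positions, and count() popcounts them.
template <std::size_t N>
std::size_t bit_hamming(const std::bitset<N>& a, const std::bitset<N>& b) {
    return (a ^ b).count();
}
```

With appropriate -march flags, std::bitset::count typically compiles down to hardware popcount instructions.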

Regarding "c++ - A faster form of Hamming distance in C++ (possibly using the standard library)?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/21092245/
