gpt4 book ai didi

c++ - 为什么 std::vector 更快?

转载 作者:塔克拉玛干 更新时间:2023-11-03 00:39:52 27 4
gpt4 key购买 nike

当我实现埃拉托色尼筛法时,我遇到了一个问题 std::vector<bool> : 无法访问原始数据。

所以我决定使用自定义的简约实现,这样我就可以访问数据指针。

#ifndef LIB_BITS_T_H
#define LIB_BITS_T_H

#include <algorithm>
template <typename B>

class bits_t{

public:

typedef B block_t;
static const size_t block_size = sizeof(block_t) * 8;

block_t* data;
size_t size;
size_t blocks;

class bit_ref{
public:
block_t* const block;
const block_t mask;

bit_ref(block_t& block, const block_t mask) noexcept : block(&block), mask(mask){}

inline void operator=(bool v) const noexcept{
if(v) *block |= mask;
else *block &= ~mask;
}

inline operator bool() const noexcept{
return (bool)(*block & mask);
}
};



bits_t() noexcept : data(nullptr){}

void resize(const size_t n, const bool v) noexcept{
block_t fill = v ? ~block_t(0) : block_t(0);
size = n;
blocks = (n + block_size - 1) / block_size;
data = new block_t[blocks];
std::fill(data, data + blocks, fill);
}

inline block_t& block_at_index(const size_t i) const noexcept{
return data[i / block_size];
}

inline size_t index_in_block(const size_t i) const noexcept{
return i % block_size;
}

inline bit_ref operator[](const size_t i) noexcept{
return bit_ref(block_at_index(i), block_t(1) << index_in_block(i));
}

~bits_t(){
delete[] data;
}

};

#endif // LIB_BITS_T_H

代码与/usr/include/c++/4.7/bits/STL_bvector.h 中的代码几乎相同,但速度较慢。

我尝试了一个优化,

#ifndef LIB_BITS_T_H
#define LIB_BITS_T_H

#include <algorithm>
template <typename B>

class bits_t{

const B mask[64] = {
0b0000000000000000000000000000000000000000000000000000000000000001,
0b0000000000000000000000000000000000000000000000000000000000000010,
0b0000000000000000000000000000000000000000000000000000000000000100,
0b0000000000000000000000000000000000000000000000000000000000001000,
0b0000000000000000000000000000000000000000000000000000000000010000,
0b0000000000000000000000000000000000000000000000000000000000100000,
0b0000000000000000000000000000000000000000000000000000000001000000,
0b0000000000000000000000000000000000000000000000000000000010000000,
0b0000000000000000000000000000000000000000000000000000000100000000,
0b0000000000000000000000000000000000000000000000000000001000000000,
0b0000000000000000000000000000000000000000000000000000010000000000,
0b0000000000000000000000000000000000000000000000000000100000000000,
0b0000000000000000000000000000000000000000000000000001000000000000,
0b0000000000000000000000000000000000000000000000000010000000000000,
0b0000000000000000000000000000000000000000000000000100000000000000,
0b0000000000000000000000000000000000000000000000001000000000000000,
0b0000000000000000000000000000000000000000000000010000000000000000,
0b0000000000000000000000000000000000000000000000100000000000000000,
0b0000000000000000000000000000000000000000000001000000000000000000,
0b0000000000000000000000000000000000000000000010000000000000000000,
0b0000000000000000000000000000000000000000000100000000000000000000,
0b0000000000000000000000000000000000000000001000000000000000000000,
0b0000000000000000000000000000000000000000010000000000000000000000,
0b0000000000000000000000000000000000000000100000000000000000000000,
0b0000000000000000000000000000000000000001000000000000000000000000,
0b0000000000000000000000000000000000000010000000000000000000000000,
0b0000000000000000000000000000000000000100000000000000000000000000,
0b0000000000000000000000000000000000001000000000000000000000000000,
0b0000000000000000000000000000000000010000000000000000000000000000,
0b0000000000000000000000000000000000100000000000000000000000000000,
0b0000000000000000000000000000000001000000000000000000000000000000,
0b0000000000000000000000000000000010000000000000000000000000000000,
0b0000000000000000000000000000000100000000000000000000000000000000,
0b0000000000000000000000000000001000000000000000000000000000000000,
0b0000000000000000000000000000010000000000000000000000000000000000,
0b0000000000000000000000000000100000000000000000000000000000000000,
0b0000000000000000000000000001000000000000000000000000000000000000,
0b0000000000000000000000000010000000000000000000000000000000000000,
0b0000000000000000000000000100000000000000000000000000000000000000,
0b0000000000000000000000001000000000000000000000000000000000000000,
0b0000000000000000000000010000000000000000000000000000000000000000,
0b0000000000000000000000100000000000000000000000000000000000000000,
0b0000000000000000000001000000000000000000000000000000000000000000,
0b0000000000000000000010000000000000000000000000000000000000000000,
0b0000000000000000000100000000000000000000000000000000000000000000,
0b0000000000000000001000000000000000000000000000000000000000000000,
0b0000000000000000010000000000000000000000000000000000000000000000,
0b0000000000000000100000000000000000000000000000000000000000000000,
0b0000000000000001000000000000000000000000000000000000000000000000,
0b0000000000000010000000000000000000000000000000000000000000000000,
0b0000000000000100000000000000000000000000000000000000000000000000,
0b0000000000001000000000000000000000000000000000000000000000000000,
0b0000000000010000000000000000000000000000000000000000000000000000,
0b0000000000100000000000000000000000000000000000000000000000000000,
0b0000000001000000000000000000000000000000000000000000000000000000,
0b0000000010000000000000000000000000000000000000000000000000000000,
0b0000000100000000000000000000000000000000000000000000000000000000,
0b0000001000000000000000000000000000000000000000000000000000000000,
0b0000010000000000000000000000000000000000000000000000000000000000,
0b0000100000000000000000000000000000000000000000000000000000000000,
0b0001000000000000000000000000000000000000000000000000000000000000,
0b0010000000000000000000000000000000000000000000000000000000000000,
0b0100000000000000000000000000000000000000000000000000000000000000,
0b1000000000000000000000000000000000000000000000000000000000000000
};

public:

typedef B block_t;
static const size_t block_size = sizeof(block_t) * 8;

block_t* data;
size_t size;
size_t blocks;

class bit_ref{
public:
block_t* const block;
const block_t mask;

bit_ref(block_t& block, const block_t mask) noexcept : block(&block), mask(mask){}

inline void operator=(bool v) const noexcept{
if(v) *block |= mask;
else *block &= ~mask;
}

inline operator bool() const noexcept{
return (bool)(*block & mask);
}
};



bits_t() noexcept : data(nullptr){}

void resize(const size_t n, const bool v) noexcept{
block_t fill = v ? ~block_t(0) : block_t(0);
size = n;
blocks = (n + block_size - 1) / block_size;
data = new block_t[blocks];
std::fill(data, data + blocks, fill);
}

inline block_t& block_at_index(const size_t i) const noexcept{
return data[i / block_size];
}

inline size_t index_in_block(const size_t i) const noexcept{
return i % block_size;
}

inline bit_ref operator[](const size_t i) noexcept{
return bit_ref(block_at_index(i), mask[index_in_block(i)]);
}

~bits_t(){
delete[] data;
}

};

#endif // LIB_BITS_T_H

(用g++4.7 -O3编译)

Eratosthenes 筛算法(33.333.333 位)

std::vector<bool> 19.1秒

bits_t<size_t> 19.9秒

bits_t<size_t> (with lookup table) 19.7秒

ctor + 调整大小(33.333.333 位)+ dtor

std::vector<bool> 120毫秒

bits_t<size_t> 150毫秒

问题:减速从何而来?

最佳答案

除了一些其他用户指出的所有问题之外,每次达到当前 block 限制以添加一个 block 时,您的调整大小都会分配更多内存。 std::vector 将使缓冲区的大小加倍(因此,如果您已经有 16 个 block ,那么现在有 32 个 block )。换句话说,他们做的新事会比你少。

话虽这么说,你没有做必要的删除和复制,这可能会对你的版本产生“积极”影响......(“积极”影响速度明智,你不删除旧的不是积极的数据,也不会将其复制到新缓冲区中。)

此外,std::vector 将适本地扩大缓冲区,从而复制可能已经在您的 CPU 缓存中的数据。对于您的版本,该缓存会丢失,因为您只是忽略了每个 resize() 的旧缓冲区。

此外,当一个类处理内存缓冲区时,出于某些原因,通常会实现复制和赋值运算符……您也可以考虑使用 shared_ptr<>()。然后隐藏删除,并且该类是一个模板,因此速度非常快(它不会添加您自己的版本中没有的任何代码。)

===更新

还有一件事。您是 operator [] 实现:

inline bit_ref operator[](const size_t i) noexcept{
return bit_ref(block_at_index(i), mask[index_in_block(i)]);
}

(旁注:不需要内联,因为您在类中编写代码这一事实已经意味着您已经同意内联功能。)

您只提供了一个“很慢”的非常量版本,因为它创建了一个子类。您应该尝试实现一个返回 bool 的 const 版本,看看这是否解释了您看到的 ~3% 的差异。

bool operator[](const size_t i) const noexcept
{
return (block_at_index(i) & mask[index_in_block(i)]) != 0;
}

此外,使用 mask[] 数组也可以减慢速度。 (1LL << (index & 0x3F)) 应该更快(2 条 CPU 指令,0 次内存访问)。

关于c++ - 为什么 std::vector<bool> 更快?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/20150603/

27 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com