c++ - Efficiency of std::min(int) in C++


Reposted by: 塔克拉玛干. Updated: 2023-11-03 00:58:17


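My code has a loop that iterates 100 million times (it simulates 100 million replications of a model). In each of the 100 million iterations, I retrieve a value from an array (myarray) by indexing it with an integer variable named age. Because of the array's length, indexing myarray[age] is only valid for age = 0,...,99. However, the actual domain of age is 0,...,inf.

So I have the following function: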

int tidx(const int& a) {
    return std::min(a,99);
}

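which allows indexing via myarray[tidx(age)].

How can I do this more efficiently?

[Performance output below]

Build output for one source file, showing the compiler flags I am using: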

Building file: ../SAR.cpp
Invoking: GCC C++ Compiler
g++ -O3 -Wall -c -fmessage-length=0 -Wno-sign-compare -fopenmp -MMD -MP -MF"SAR.d" -MT"SAR.d" -o"SAR.o" "../SAR.cpp"
Finished building: ../SAR.cpp

From perf record, viewed with perf report:

Samples: 280  of event 'cycles', Event count (approx.): 179855989                                                                                   
 24.78%  pc2  libc-2.17.so         [.] __GI_____strtod_l_internal
 11.35%  pc2  pc2                  [.] samplePSA(int, double, int, NRRan&)
  6.67%  pc2  libc-2.17.so         [.] str_to_mpn.isra.0
  6.15%  pc2  pc2                  [.] simulate4_NEJMdisutilities(Policy&, bool)
  5.68%  pc2  pc2                  [.] (anonymous namespace)::stateTransition(double const&, int const&, int&, double const&, bool const&, bool&, bo
  5.25%  pc2  pc2                  [.] HistogramAges::add(double const&)
  3.73%  pc2  libstdc++.so.6.0.17  [.] std::istream::getline(char*, long, char)
  3.02%  pc2  libstdc++.so.6.0.17  [.] std::basic_istream<char, std::char_traits<char> >& std::operator>><char, std::char_traits<char> >(std::basic_
  2.49%  pc2  [kernel.kallsyms]    [k] 0xffffffff81043e6a
  2.29%  pc2  libc-2.17.so         [.] __strlen_sse2
  2.00%  pc2  libc-2.17.so         [.] __mpn_lshift
  1.72%  pc2  libstdc++.so.6.0.17  [.] __cxxabiv1::__vmi_class_type_info::__do_dyncast(long, __cxxabiv1::__class_type_info::__sub_kind, __cxxabiv1::
  1.71%  pc2  libc-2.17.so         [.] __memcpy_ssse3_back
  1.67%  pc2  libstdc++.so.6.0.17  [.] std::locale::~locale()
  1.65%  pc2  libc-2.17.so         [.] __mpn_construct_double
  1.38%  pc2  libc-2.17.so         [.] memchr
  1.29%  pc2  pc2                  [.] (anonymous namespace)::readTransitionMatrix(double*, std::string)
  1.27%  pc2  libstdc++.so.6.0.17  [.] std::string::_M_mutate(unsigned long, unsigned long, unsigned long)
  1.15%  pc2  libc-2.17.so         [.] round_and_return
  1.02%  pc2  libc-2.17.so         [.] __mpn_mul
  1.01%  pc2  libstdc++.so.6.0.17  [.] std::istream::sentry::sentry(std::istream&, bool)
  1.00%  pc2  libc-2.17.so         [.] __memcpy_sse2
  0.85%  pc2  libstdc++.so.6.0.17  [.] std::locale::locale(std::locale const&)
  0.85%  pc2  libstdc++.so.6.0.17  [.] std::string::_M_replace_safe(unsigned long, unsigned long, char const*, unsigned long)
  0.83%  pc2  libstdc++.so.6.0.17  [.] std::locale::locale()
  0.73%  pc2  libc-2.17.so         [.] __mpn_mul_1

From perf stat:

 Performance counter stats for './release/pc2':

         62.449034 task-clock                #    0.988 CPUs utilized          
                49 context-switches          #    0.785 K/sec                  
                 3 cpu-migrations            #    0.048 K/sec                  
               861 page-faults               #    0.014 M/sec                  
       179,240,478 cycles                    #    2.870 GHz                    
        58,909,298 stalled-cycles-frontend   #   32.87% frontend cycles idle   
   <not supported> stalled-cycles-backend  
       320,437,960 instructions              #    1.79  insns per cycle        
                                             #    0.18  stalled cycles per insn
        70,932,710 branches                  # 1135.850 M/sec                  
           697,468 branch-misses             #    0.98% of all branches        

       0.063228446 seconds time elapsed

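Any comments would be appreciated. I need to learn how to interpret and read this information, so any tips that might help me get started would be very welcome.

Best answer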

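To optimize code, you must first figure out where the bottleneck is. To find the bottleneck, you have to profile the code. Otherwise, chances are that a lot of time will be wasted on micro-optimizations or false optimizations that don't matter at all.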
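As a starting point for reading the perf stat output in the question, the derived figures in its right-hand column can be recomputed from the raw counters. The sketch below is only an illustration: the struct and function names are made up here, and the numbers are copied from the output above. Note also that the run took about 0.063 s and that the largest entry in the perf report is __GI_____strtod_l_internal (string-to-double parsing), which suggests that this particular profile is dominated by reading input rather than by the simulation loop.

```cpp
#include <cassert>
#include <cstdint>

// Raw counters copied from the perf stat output in the question.
// The struct and its field names are hypothetical, chosen for this sketch.
struct PerfCounters {
    std::uint64_t cycles;
    std::uint64_t stalled_frontend;
    std::uint64_t instructions;
    std::uint64_t branches;
    std::uint64_t branch_misses;
};

constexpr PerfCounters kRun{179240478ULL, 58909298ULL, 320437960ULL,
                            70932710ULL, 697468ULL};

// Instructions retired per cycle ("insns per cycle" in perf stat).
double ipc(const PerfCounters& c) {
    assert(c.cycles > 0);
    return static_cast<double>(c.instructions) / static_cast<double>(c.cycles);
}

// Share of cycles in which the frontend was idle, in percent.
double frontend_stall_pct(const PerfCounters& c) {
    assert(c.cycles > 0);
    return 100.0 * static_cast<double>(c.stalled_frontend) /
           static_cast<double>(c.cycles);
}

// Share of branches that were mispredicted, in percent.
double branch_miss_pct(const PerfCounters& c) {
    assert(c.branches > 0);
    return 100.0 * static_cast<double>(c.branch_misses) /
           static_cast<double>(c.branches);
}
```

Calling ipc(kRun), frontend_stall_pct(kRun) and branch_miss_pct(kRun) should reproduce, up to rounding, the 1.79, 32.87% and 0.98% that perf stat printed.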

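I haven't run a profiler on your minimal working code example (which you didn't provide), but from experience I can tell you this: your tidx() function is not a bottleneck, and you shouldn't worry about the performance of std::min() in this case. The bottleneck is more likely to be memory access and stalled CPU cycles.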
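To illustrate how little work the clamp involves, here is a minimal sketch of the same function taking the int by value. The constant name kMaxAgeIndex, the lookup() helper, and the double element type are assumptions made for this example; the question does not show the type of myarray. With optimization enabled, compilers typically inline a function like this, so it is unlikely to show up in a profile on its own.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical name for the last valid index; the question states that
// myarray is only valid for indices 0..99.
constexpr int kMaxAgeIndex = 99;

// Same result as the question's tidx(), but the int is passed by value.
constexpr int tidx(int a) {
    return std::min(a, kMaxAgeIndex);
}

static_assert(tidx(0) == 0, "in-range ages map to themselves");
static_assert(tidx(150) == 99, "ages beyond the table are clamped to 99");

// Hypothetical lookup helper; the element type double is an assumption.
double lookup(const double (&table)[100], int age) {
    assert(age >= 0);  // the question says the domain of age starts at 0
    return table[tidx(age)];
}
```

Whether passing by value instead of by const reference makes any measurable difference here would have to be checked with a profiler; after inlining, the two forms often compile to the same code.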

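For starters, try to unroll your loop if possible (and if the compiler hasn't already done it for you). Running 25000000 iterations that each do more work may be more efficient than running 100000000 iterations. But before you do that, you must make sure that unrolling the loop helps rather than hurts. This is usually done by profiling, so we are back to the point that to optimize code, you must first figure out where the bottleneck is. To find the bottleneck... oh, wait, I almost got into an infinite loop here. Aborting.

Regarding c++ - Efficiency of std::min(int) in C++, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/16740879/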
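A manual 4x unroll could look roughly like the sketch below. It is a stand-in rather than the question's actual loop, which was not shown: the functions sum_simple() and sum_unrolled4(), the ages array, and the double element type are all assumptions made for illustration. The unrolled version does four lookups per iteration and uses four independent accumulators, so 100000000 elements take 25000000 iterations.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Clamp as in the question: indices above 99 map to 99.
inline int tidx(int a) { return std::min(a, 99); }

// Baseline: one lookup per iteration.
double sum_simple(const double* table, const int* ages, std::size_t n) {
    assert(table != nullptr && ages != nullptr);
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += table[tidx(ages[i])];
    }
    return total;
}

// Unrolled by four, with independent accumulators.
double sum_unrolled4(const double* table, const int* ages, std::size_t n) {
    assert(table != nullptr && ages != nullptr);
    double t0 = 0.0, t1 = 0.0, t2 = 0.0, t3 = 0.0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        t0 += table[tidx(ages[i])];
        t1 += table[tidx(ages[i + 1])];
        t2 += table[tidx(ages[i + 2])];
        t3 += table[tidx(ages[i + 3])];
    }
    // Handle the remaining 0..3 elements.
    for (; i < n; ++i) {
        t0 += table[tidx(ages[i])];
    }
    return (t0 + t1) + (t2 + t3);
}
```

Because the unrolled version adds the values in a different order, its floating-point result can differ from the baseline in the last bits. Whether it is actually faster depends on the compiler and the hardware, so both versions should be timed or profiled before one is preferred.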
