c++ - Clang sizeof("literal") optimization

Reposted by: 行者123 · Last updated: 2023-11-28 02:17:38

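While experimenting with C++, I tried to understand the performance difference between sizeof and strlen when applied to a string literal.

Here is my small benchmark code: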

#include <iostream>
#include <cstring>

#define LOOP_COUNT 1000000000

unsigned long long rdtscl(void)
{
    unsigned int lo, hi;
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
    return ( (unsigned long long)lo)|( ((unsigned long long)hi)<<32 );
}

int main()
{
    unsigned long long before = rdtscl();
    size_t ret;
    for (int i = 0; i < LOOP_COUNT; i++)
        ret = strlen("abcd");
    unsigned long long after = rdtscl();
    std::cout << "Strlen " << (after - before) << " ret=" << ret << std::endl;

    before = rdtscl();
    for (int i = 0; i < LOOP_COUNT; i++)
        ret = sizeof("abcd");
    after = rdtscl();
    std::cout << "Sizeof " << (after - before) << " ret=" << ret << std::endl;
}

Compiled with clang++, it gives the following result:

clang++ -O3 -Wall -o sizeof_vs_strlen sizeof_vs_strlen.cpp
./sizeof_vs_strlen

Strlen 36 ret=4
Sizeof 62092396 ret=5

With g++:

g++ -O3 -Wall -o sizeof_vs_strlen sizeof_vs_strlen.cpp 
./sizeof_vs_strlen

Strlen 30 ret=4
Sizeof 30 ret=5

I strongly suspect that g++ optimizes away the loop containing sizeof and that clang++ does not. Is this a known issue?

EDIT:

Assembly generated by clang++ for the loop containing sizeof:

rdtsc  
mov    %edx,%r14d
shl    $0x20,%r14
mov    $0x3b9aca01,%ecx
xchg   %ax,%ax
add    $0xffffffed,%ecx // 0x400ad0
jne    0x400ad0 <main+192>
mov    %eax,%eax
or     %rax,%r14
rdtsc

And the assembly generated by g++:

rdtsc  
mov    %edx,%esi
mov    %eax,%ecx
rdtsc

I don't understand why clang++ emits the {add, jne} loop, which looks useless. Is this a bug?

Info:

g++ (GCC) 5.1.0
clang version 3.6.2 (tags/RELEASE_362/final)

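EDIT 2: This is probably a bug in clang. I opened a bug report.

Best answer

I would call this a bug in clang.

It is actually optimizing away the sizeof itself, just not the loop.

To make the code clearer, I changed std::cout to printf; the LLVM IR for main then looks like this: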

; Function Attrs: nounwind uwtable
define i32 @main() #0 {
entry:
  %0 = tail call { i32, i32 } asm sideeffect "rdtsc", "={ax},={dx},~{dirflag},~{fpsr},~{flags}"() #2, !srcloc !1
  %asmresult1.i = extractvalue { i32, i32 } %0, 1
  %conv2.i = zext i32 %asmresult1.i to i64
  %shl.i = shl nuw i64 %conv2.i, 32
  %asmresult.i = extractvalue { i32, i32 } %0, 0
  %conv.i = zext i32 %asmresult.i to i64
  %or.i = or i64 %shl.i, %conv.i
  %1 = tail call { i32, i32 } asm sideeffect "rdtsc", "={ax},={dx},~{dirflag},~{fpsr},~{flags}"() #2, !srcloc !1
  %asmresult.i.25 = extractvalue { i32, i32 } %1, 0
  %asmresult1.i.26 = extractvalue { i32, i32 } %1, 1
  %conv.i.27 = zext i32 %asmresult.i.25 to i64
  %conv2.i.28 = zext i32 %asmresult1.i.26 to i64
  %shl.i.29 = shl nuw i64 %conv2.i.28, 32
  %or.i.30 = or i64 %shl.i.29, %conv.i.27
  %sub = sub i64 %or.i.30, %or.i
  %call2 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([21 x i8], [21 x i8]* @.str, i64 0, i64 0), i64 %sub, i64 4)
  %2 = tail call { i32, i32 } asm sideeffect "rdtsc", "={ax},={dx},~{dirflag},~{fpsr},~{flags}"() #2, !srcloc !1
  %asmresult1.i.32 = extractvalue { i32, i32 } %2, 1
  %conv2.i.34 = zext i32 %asmresult1.i.32 to i64
  %shl.i.35 = shl nuw i64 %conv2.i.34, 32
  br label %for.cond.5

for.cond.5:                                       ; preds = %for.cond.5, %entry
  %i4.0 = phi i32 [ 0, %entry ], [ %inc10.18, %for.cond.5 ]
  %inc10.18 = add nsw i32 %i4.0, 19
  %exitcond.18 = icmp eq i32 %inc10.18, 1000000001
  br i1 %exitcond.18, label %for.cond.cleanup.7, label %for.cond.5

for.cond.cleanup.7:                               ; preds = %for.cond.5
  %asmresult.i.31 = extractvalue { i32, i32 } %2, 0
  %conv.i.33 = zext i32 %asmresult.i.31 to i64
  %or.i.36 = or i64 %shl.i.35, %conv.i.33
  %3 = tail call { i32, i32 } asm sideeffect "rdtsc", "={ax},={dx},~{dirflag},~{fpsr},~{flags}"() #2, !srcloc !1
  %asmresult.i.37 = extractvalue { i32, i32 } %3, 0
  %asmresult1.i.38 = extractvalue { i32, i32 } %3, 1
  %conv.i.39 = zext i32 %asmresult.i.37 to i64
  %conv2.i.40 = zext i32 %asmresult1.i.38 to i64
  %shl.i.41 = shl nuw i64 %conv2.i.40, 32
  %or.i.42 = or i64 %shl.i.41, %conv.i.39
  %sub13 = sub i64 %or.i.42, %or.i.36
  %call14 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([21 x i8], [21 x i8]* @.str, i64 0, i64 0), i64 %sub13, i64 5)
  ret i32 0
}

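As you can see, the call to printf uses the constant 5 from sizeof, and for.cond.5: starts the empty loop. That loop consists of a "phi" node (which selects the "new" value of i depending on where control came from: 0 when entering from before the loop, %inc10.18 when coming from inside the loop), an increment (by 19 per trip in this IR), and a conditional branch that jumps back if %inc10.18 is not 1000000001.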
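
As a rough cross-check of the loop described above, the sketch below replays the {add, jne} pair from the clang++ disassembly in the question. The two immediates are copied from that listing; the names kCounterStart, kAddImmediate and simulate_trips are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Immediates copied from the clang++ disassembly in the question.
constexpr std::uint32_t kCounterStart = 0x3b9aca01u;  // mov $0x3b9aca01,%ecx
constexpr std::uint32_t kAddImmediate = 0xffffffedu;  // add $0xffffffed,%ecx

// 0x3b9aca01 is the same 1000000001 that appears as the exit constant in the IR.
static_assert(kCounterStart == 1000000001u, "start value matches the IR constant");
// Adding 0xffffffed modulo 2^32 is the same as subtracting 19.
static_assert(0x100000000ull - kAddImmediate == 19ull, "step is 19");

// Replays the loop: add the immediate to the counter, repeat until it is zero.
std::uint64_t simulate_trips() {
    std::uint32_t ecx = kCounterStart;
    std::uint64_t trips = 0;
    do {
        ecx += kAddImmediate;  // unsigned wraparound, i.e. ecx -= 19
        ++trips;
    } while (ecx != 0);        // jne: jump back while the counter is non-zero
    assert(trips * 19u == kCounterStart);  // the counter reaches zero exactly
    return trips;
}
```

simulate_trips() returns 52631579, so the loop body runs about 52.6 million times rather than a billion. Dividing the 62092396 ticks reported for clang++ by that count gives a little over one tick per trip, which is consistent with an empty counting loop and not with any per-iteration sizeof work.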

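I don't know enough about clang and LLVM to explain why that empty loop is not optimized away. But it is certainly not sizeof that takes the time, because there is no sizeof inside the loop.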

It is worth noting that in C++ sizeof is always a compile-time constant; it never "takes time" beyond loading a constant value into a register.
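
A minimal sketch of that point, using the same "abcd" literal as the benchmark. The names buffer and literal_length are hypothetical; only sizeof, static_assert and std::strlen are standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// sizeof on the literal yields the size of its array type, const char[5]:
// four characters plus the terminating '\0'. It is checked at compile time.
static_assert(sizeof("abcd") == 5, "array size includes the terminator");

// Because it is a constant expression, it can be used as an array bound.
char buffer[sizeof("abcd")];

// strlen counts characters up to, but not including, the terminating '\0'.
std::size_t literal_length() {
    std::size_t n = std::strlen("abcd");
    assert(n + 1 == sizeof("abcd"));  // the two results differ by the terminator
    return n;
}
```

This matches the ret=5 and ret=4 values printed by the benchmark. The very small Strlen timings in the question suggest that both compilers also folded strlen("abcd") to a constant, but that is an optimization the compiler may choose to perform, whereas the sizeof result is fixed by the language.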

Regarding c++ - Clang sizeof("literal") optimization, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/33562365/
