c++ - Assembly / inline __asm


Reposted by: 行者123 · Updated: 2023-11-30 00:38:07


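I'm learning assembly and doing some inlining with my Digital Mars C++ compiler. I searched for ways to make a program faster and came up with these parameters for tuning it: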

Use a better C++ compiler (thinking of GCC or the Intel compiler).

Use assembly only in the critical part of the program.

Find a better algorithm.

Cache miss, cache contention.

Loop-carried dependency chain.

Instruction fetching time.

Instruction decoding time.

Instruction retirement.

Register read stalls.

Execution port throughput.

Execution unit throughput.

Suboptimal reordering and scheduling of micro-ops.

Branch misprediction.

Floating point exception.

I understand all of these except "register read stalls".

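Question: Can anyone tell me how this happens in the CPU, and explain the "superscalar" form of "out-of-order execution"? Plain "out-of-order" seems logical, but I couldn't find a logical explanation of the "superscalar" form.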

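Question 2: Can someone also give a good instruction list for SSE, SSE2, and newer CPUs, preferably with micro-op tables, port throughput, execution units, and some latency tables, so I can find the real bottleneck in a piece of code?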

I'd be happy with a small example like this:

// Breaking a loop-carried dependency chain:
__asm
{
loop_begin:
    ....
    ....
    sub edx, 05h    // instead of computing i*5 in each iteration, subtract 5 each iteration
    sub ecx, 01h    // i-- (loop counter); edit: this must come after sub edx so that jnz tests its flags
    ...
    ...
    jnz loop_begin  // the elided instructions above must leave ZF unchanged
}
// sub edx both removes a multiplication and makes edx independent of ecx.
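The same idea can be sketched in plain C++, which makes it easier to check that the rewritten loop computes the same thing. This is only an illustration under stated assumptions: the function names, the `data` vector, and the choice of summing every fifth element are hypothetical and not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical baseline: recomputes i * 5 in every iteration.
long sum_with_multiply(const std::vector<int>& data, std::size_t count) {
    assert(count == 0 || (count - 1) * 5 < data.size());
    long total = 0;
    for (std::size_t i = count; i > 0; --i) {
        total += data[(i - 1) * 5];   // index depends on the counter via a multiply
    }
    return total;
}

// Hypothetical strength-reduced version: keeps a running offset and
// subtracts 5 per iteration, mirroring "sub edx, 05h" / "sub ecx, 01h".
long sum_with_running_offset(const std::vector<int>& data, std::size_t count) {
    assert(count == 0 || (count - 1) * 5 < data.size());
    long total = 0;
    std::size_t offset = count * 5;   // one multiply before the loop
    for (std::size_t i = count; i > 0; --i) {
        offset -= 5;                  // replaces the per-iteration multiply
        total += data[offset];        // index no longer depends on i
    }
    return total;
}
```

An optimizing compiler will often perform this transformation on its own, so it is worth inspecting the generated code before doing it by hand.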

Thanks.

Computer: Pentium-M 2GHz, Windows XP 32-bit

Best answer

You should take a look at Agner Fog's optimization manuals: Optimizing software in C++: An optimization guide for Windows, Linux and Mac platforms, or Optimizing subroutines in assembly language: An optimization guide for x86 platforms.

But to really outperform a modern compiler, you need good background knowledge of the architecture you are optimizing for: The microarchitecture of Intel, AMD and VIA CPUs: An optimization guide for assembly programmers and compiler makers.
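To make one item from the question's list concrete, here is a minimal C++ sketch of a loop-carried dependency chain and one common way to shorten it: splitting the work across two independent accumulators. The function names are hypothetical. Integers are used so both versions give identical results; the same split on floating-point values can change rounding.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One accumulator: every addition must wait for the previous one,
// so the whole loop is a single dependency chain.
std::uint64_t sum_single_chain(const std::vector<std::uint32_t>& v) {
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}

// Two accumulators: the even and odd chains do not depend on each other,
// so an out-of-order, superscalar core can overlap their additions.
std::uint64_t sum_two_chains(const std::vector<std::uint32_t>& v) {
    std::uint64_t even = 0;
    std::uint64_t odd = 0;
    std::size_t i = 0;
    for (; i + 1 < v.size(); i += 2) {
        even += v[i];
        odd += v[i + 1];
    }
    if (i < v.size()) {
        even += v[i];   // leftover element when the size is odd
    }
    return even + odd;
}
```

Whether this helps depends on the latency of the operation and on what the compiler already does; measuring on the target CPU is the only reliable check.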

Regarding c++ - Assembly / inline __asm, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/11701888/
