
C++ explicit template instantiation - code is still duplicated

Reposted. Author: 搜寻专家. Updated: 2023-10-31 01:29:17

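I have some template code that performs fairly heavy computations, but I only need it for float and double. The goal is for the template to be instantiated only once, in a single compilation unit, rather than again in every file that uses it.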

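I tried to follow the ideas from Stack Overflow posts on this topic and from similar duplicate questions. I came up with the following test to illustrate the problem: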

A.h

#pragma once
#include <cmath>
template<typename T>
struct A
{
    static T foo(T a, T b)
    {
        //do some heavy computations
        T v1 = pow(a, b);
        return pow(v1, b);
    }
};

//explicit template instantiations, the declaration
extern template struct A<float>;
extern template struct A<double>;

A.cpp

#include "A.h"
//explicit template instantiations, the definition
template struct A<float>;
template struct A<double>;

main.cpp

#include "A.h"
int main()
{
    //use A
    float result = A<float>::foo(0, 0);
    return (int)result; //return it so that it doesn't get optimized away
}

When I now look at the generated .obj files (dumpbin /DISASM), I get the following output:

A.obj

Dump of file A.obj

File Type: COFF OBJECT

?foo@?$A@M@@SAMMM@Z (public: static float __cdecl A<float>::foo(float,float)):
0000000000000000: F3 0F 11 4C 24 10 movss dword ptr [rsp+10h],xmm1
0000000000000006: F3 0F 11 44 24 08 movss dword ptr [rsp+8],xmm0
000000000000000C: 55 push rbp
000000000000000D: 57 push rdi
000000000000000E: 48 81 EC 18 01 00 sub rsp,118h
00
0000000000000015: 48 8D 6C 24 30 lea rbp,[rsp+30h]
000000000000001A: 48 8B FC mov rdi,rsp
000000000000001D: B9 46 00 00 00 mov ecx,46h
0000000000000022: B8 CC CC CC CC mov eax,0CCCCCCCCh
0000000000000027: F3 AB rep stos dword ptr [rdi]
0000000000000029: F3 0F 10 8D 08 01 movss xmm1,dword ptr [rbp+108h]
00 00
0000000000000031: F3 0F 10 85 00 01 movss xmm0,dword ptr [rbp+100h]
00 00
0000000000000039: E8 00 00 00 00 call ?pow@@YAMMM@Z
000000000000003E: F3 0F 11 45 04 movss dword ptr [rbp+4],xmm0
0000000000000043: F3 0F 10 8D 08 01 movss xmm1,dword ptr [rbp+108h]
00 00
000000000000004B: F3 0F 10 45 04 movss xmm0,dword ptr [rbp+4]
0000000000000050: E8 00 00 00 00 call ?pow@@YAMMM@Z
0000000000000055: 48 8D A5 E8 00 00 lea rsp,[rbp+0E8h]
00
000000000000005C: 5F pop rdi
000000000000005D: 5D pop rbp
000000000000005E: C3 ret

?foo@?$A@N@@SANNN@Z (public: static double __cdecl A<double>::foo(double,double)):
0000000000000000: F2 0F 11 4C 24 10 movsd mmword ptr [rsp+10h],xmm1
0000000000000006: F2 0F 11 44 24 08 movsd mmword ptr [rsp+8],xmm0
000000000000000C: 55 push rbp
000000000000000D: 57 push rdi
000000000000000E: 48 81 EC 18 01 00 sub rsp,118h
00
0000000000000015: 48 8D 6C 24 30 lea rbp,[rsp+30h]
000000000000001A: 48 8B FC mov rdi,rsp
000000000000001D: B9 46 00 00 00 mov ecx,46h
0000000000000022: B8 CC CC CC CC mov eax,0CCCCCCCCh
0000000000000027: F3 AB rep stos dword ptr [rdi]
0000000000000029: F2 0F 10 8D 08 01 movsd xmm1,mmword ptr [rbp+108h]
00 00
0000000000000031: F2 0F 10 85 00 01 movsd xmm0,mmword ptr [rbp+100h]
00 00
0000000000000039: E8 00 00 00 00 call pow
000000000000003E: F2 0F 11 45 08 movsd mmword ptr [rbp+8],xmm0
0000000000000043: F2 0F 10 8D 08 01 movsd xmm1,mmword ptr [rbp+108h]
00 00
000000000000004B: F2 0F 10 45 08 movsd xmm0,mmword ptr [rbp+8]
0000000000000050: E8 00 00 00 00 call pow
0000000000000055: 48 8D A5 E8 00 00 lea rsp,[rbp+0E8h]
00
000000000000005C: 5F pop rdi
000000000000005D: 5D pop rbp
000000000000005E: C3 ret
....

Main.obj

Dump of file Main.obj

File Type: COFF OBJECT

?foo@?$A@M@@SAMMM@Z (public: static float __cdecl A<float>::foo(float,float)):
0000000000000000: F3 0F 11 4C 24 10 movss dword ptr [rsp+10h],xmm1
0000000000000006: F3 0F 11 44 24 08 movss dword ptr [rsp+8],xmm0
000000000000000C: 55 push rbp
000000000000000D: 57 push rdi
000000000000000E: 48 81 EC 18 01 00 sub rsp,118h
00
0000000000000015: 48 8D 6C 24 30 lea rbp,[rsp+30h]
000000000000001A: 48 8B FC mov rdi,rsp
000000000000001D: B9 46 00 00 00 mov ecx,46h
0000000000000022: B8 CC CC CC CC mov eax,0CCCCCCCCh
0000000000000027: F3 AB rep stos dword ptr [rdi]
0000000000000029: F3 0F 10 8D 08 01 movss xmm1,dword ptr [rbp+108h]
00 00
0000000000000031: F3 0F 10 85 00 01 movss xmm0,dword ptr [rbp+100h]
00 00
0000000000000039: E8 00 00 00 00 call ?pow@@YAMMM@Z
000000000000003E: F3 0F 11 45 04 movss dword ptr [rbp+4],xmm0
0000000000000043: F3 0F 10 8D 08 01 movss xmm1,dword ptr [rbp+108h]
00 00
000000000000004B: F3 0F 10 45 04 movss xmm0,dword ptr [rbp+4]
0000000000000050: E8 00 00 00 00 call ?pow@@YAMMM@Z
0000000000000055: 48 8D A5 E8 00 00 lea rsp,[rbp+0E8h]
00
000000000000005C: 5F pop rdi
000000000000005D: 5D pop rbp
000000000000005E: C3 ret
....

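A::foo is instantiated in A.obj as expected. But the code is also emitted again into Main.obj, completely ignoring the extern keyword.

How can I tell the compiler (Visual Studio 2017, Release mode) not to inline the method and to use the version from A.obj instead?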

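Best answer

You can do that with __declspec(noinline).

However, the inlined version will probably be faster. If you are worried about binary size, your .exe will contain only one instance of the function. The code from A.obj is unused and will be discarded by the linker during its dead-code elimination step.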
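As background for why the extern declaration appears to be ignored: a member function defined inside the class body is implicitly inline, and the C++ standard exempts inline functions from the suppression that an explicit instantiation declaration provides. A portable alternative to __declspec(noinline) is therefore to define the member outside the class without the inline keyword. The following is a minimal single-file sketch of that idea, not taken from the original answer; the comments mark where each part would live in a real project, and demo_noninline is a hypothetical helper added only for illustration.

```cpp
#include <cassert>
#include <cmath>

// ---- A.h (as in the question, but with the body moved out of the class) ----
template <typename T>
struct A
{
    static T foo(T a, T b); // declaration only, so foo is not implicitly inline
};

// Out-of-class definition without `inline`. It is still visible to every file
// that includes the header, but the explicit instantiation declarations below
// now suppress implicit instantiation of it in those files.
template <typename T>
T A<T>::foo(T a, T b)
{
    // stand-in for the heavy computation
    T v1 = std::pow(a, b);
    return std::pow(v1, b);
}

// explicit instantiation declarations
extern template struct A<float>;
extern template struct A<double>;

// ---- A.cpp: the one place where the code is generated ----
template struct A<float>;
template struct A<double>;

// ---- main.cpp: hypothetical caller ----
float demo_noninline()
{
    float r = A<float>::foo(2.0f, 2.0f); // pow(pow(2, 2), 2)
    assert(std::fabs(r - 16.0f) < 1e-4f);
    return r;
}
```

Whether the optimizer still inlines the call (for example under link-time code generation) is up to the compiler, so it is worth checking the disassembly of Main.obj again after making such a change.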

Update: put this into your A.h:

static __declspec( noinline ) T foo( T a, T b )
{
    //do some heavy computations
    T v1 = pow( a, b );
    return pow( v1, b );
}

I built this with Visual C++ 2017 15.6.7, Release 32-bit and 64-bit builds, and on both platforms Main.cpp compiles to:

; Line 5
call ?foo@?$A@M@@SAMMM@Z ; A<float>::foo
; Line 6
cvttss2si eax, xmm0

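However, if you are doing this to cut compilation time, I am not sure noinline helps. Instead, remove the function body from A.h (leave the declaration) and move it into A.cpp. Ideally, also remove the Eigen headers from A.h (or keep only the bare minimum that defines the data structures) and include the Eigen headers in A.cpp.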
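The split described above might look roughly like the sketch below. It is a single-file approximation under stated assumptions: the section comments show which hypothetical file each part belongs to, and call_from_main and check_example are illustrative names that do not appear in the original post. The caller is deliberately placed before the function body to mimic main.cpp, which would see only the declaration.

```cpp
#include <cassert>
#include <cmath>

// ---- A.h: declaration only, no function body and no heavy includes ----
template <typename T>
struct A
{
    static T foo(T a, T b);
};

// ---- main.cpp: compiles against the declaration alone ----
double call_from_main()
{
    return A<double>::foo(3.0, 2.0); // pow(pow(3, 2), 2)
}

// ---- A.cpp: the body plus the explicit instantiation definitions ----
template <typename T>
T A<T>::foo(T a, T b)
{
    // stand-in for the heavy computation
    T v1 = std::pow(a, b);
    return std::pow(v1, b);
}

// Only these two specializations are generated; in a real multi-file build,
// any other T would fail at link time with an unresolved symbol.
template struct A<float>;
template struct A<double>;

// Sanity check for the sketch.
void check_example()
{
    assert(std::fabs(call_from_main() - 81.0) < 1e-9);
}
```

With this layout no extern template declarations are needed in the header, because callers never see a definition they could instantiate themselves. The trade-off is that the compiler can no longer inline foo into callers unless link-time optimization is enabled.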

This question and answer about "C++ explicit template instantiation - code is still duplicated" come from Stack Overflow: https://stackoverflow.com/questions/50236475/
