
optimization - Use of frame pointer optimization

Reposted. Author: 行者123. Updated: 2023-12-03 05:03:12

This is related to, but different from, "frame pointer omitting ? Any risk?"

I am trying to follow this old (but still relevant) article:

http://blogs.msdn.com/b/larryosterman/archive/2007/03/12/fpo.aspx

Larry (the author) writes:

machines got sufficiently faster since 1995 that the performance improvements that were achieved by FPO weren't sufficient to counter the pain in debugging and analysis that FPO caused

However, in the discussion further down the page, one user writes:

Disabling FPO can have both serious code size and performance impact. Tail call optimizations have to be disabled when a frame pointer is present, leading to much greater stack usage in affected paths. Small functions are also disproportionately affected by prolog/epilog code. Third, although there are still six registers available with a frame pointer on X86, only three of them are nonvolatile with respect to nested calls: EBX, ESI, and EDI. Opening up a fourth register can drop out a bunch of spill code.

I have a few questions.

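  1. Spill code == register spillage?
  2. Is the author correct that FPO is generally considered a pain, and that the gain does not outweigh it?
  3. Is FPO still relevant today on the x64 architecture, given that there are a lot more registers to play with?
  4. Do you use FPO? What for (if yes), and does it make a difference to you?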

At the end of this article

http://www.altdevblogaday.com/2012/05/24/x64-abi-intro-to-the-windows-x64-calling-convention/

the author says:

[with respect to the Windows x64 calling convention].....

All parameters have space reserved on the stack, even the ones passed in registers. In fact, there’s stack space for 4 parameters even if your function doesn’t have any params. Those parameters are 8 bytes so that’s at least 32 bytes on the stack for every function (every function actually has at least 48 bytes on the stack…I’ll explain that another time). This stack area is called the home space. There are few reasons behind this home space:

  1. If the registers need to be used for something else, the called function can store the data in the home space without moving the stack pointer.
  2. It keeps the stack structure easy to determine. That’s very handy for debugging, and perhaps necessary for x64’s stack metadata (another point I’ll come back to another time). ...... The compiler can use it for whatever it wants, and an optimized build will likely make great use of it.

Wouldn't an optimized build optimize the excess allocation away?

Best answer

1.Spill code == Register spillage?

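Pretty much. Strictly speaking, spill code is the code a compiler adds to implement register spilling. Spilling itself is the decision to mark a live range as not fitting in registers.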
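As an illustration, here is a minimal sketch of the kind of function that can provoke spilling. The name `mix8` and the arithmetic are hypothetical and chosen only to keep many values live at once. Whether the compiler actually spills anything depends on the target, the compiler version, and the flags, so the way to find out is to inspect the generated assembly (for example with `gcc -O2 -S`).

```c
#include <stdint.h>

/* Hypothetical example: eight accumulators plus the pointer, the index and
   the loaded element are all live inside the loop. On 32-bit x86 that is
   more values than there are general-purpose registers, so the register
   allocator may have to keep some of them in stack slots (spill them). */
uint32_t mix8(const uint32_t *p, int n)
{
    uint32_t a = 1, b = 2, c = 3, d = 4, e = 5, f = 6, g = 7, h = 8;
    for (int i = 0; i < n; i++) {
        uint32_t x = p[i];
        a += x;  b ^= a;
        c += b;  d ^= c;
        e += d;  f ^= e;
        g += f;  h ^= g;
    }
    return a + b + c + d + e + f + g + h;
}
```

Any loads and stores to stack slots that show up inside the loop purely to free a register are the spill code; freeing the frame pointer register gives the allocator one more place to put a live value.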

2.Is the author correct that FPO is generally considered a pain and the gain does not outweigh the pain?

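The author is probably correct that, on modern processor architectures, the class of functions where FPO produces a significant performance gain is smaller than it used to be. However, FPO does make code smaller, which reduces cache pressure. It does reduce register pressure. These can matter in some settings. It does speed up prolog and epilog code by a few instructions. On the other side, debuggers don't work well without a frame pointer, which means core dumps are less useful for post-mortem analysis of optimized production code. You would never want to use FPO during development, except in final testing.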
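To see the prolog/epilog difference for yourself, a small test file is enough. The sketch below is hypothetical (the function names are made up); the gcc flags `-fomit-frame-pointer` and `-fno-omit-frame-pointer`, and the MSVC `/Oy` switch for 32-bit x86, are the real controls for this optimization.

```c
/* Hypothetical small function. With a frame pointer, a 32-bit x86 build
   typically wraps the body in
       push ebp / mov ebp, esp   ...   pop ebp / ret
   With FPO those instructions are dropped and EBP becomes available as an
   ordinary register. */
int scale_add(int x, int y, int k)
{
    return x * k + y;
}

/* A non-leaf caller in the same file, so the effect on a function that
   makes calls can be compared as well. */
int sum_scaled(const int *v, int n, int k)
{
    int acc = 0;
    for (int i = 0; i < n; i++)
        acc = scale_add(v[i], acc, k);
    return acc;
}
```

Compiling this file twice to assembly (`-S`), once with each flag, and diffing the output shows what the option changes for your compiler and target. At higher optimization levels `scale_add` may simply be inlined, in which case only the prolog of `sum_scaled` is left to compare.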

3.Is FPO still relevant today in x64 architecture since there are a LOT more registers to play with?

Modern processors are so varied and complex that you almost never know what is "relevant" until you try it and measure.
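A rough way to "try it and measure" is to build the same timing harness with and without the option and compare. The sketch below assumes a stand-in workload (`work` is hypothetical; substitute the code you actually care about) and uses the standard `clock()` function, which reports CPU time at fairly coarse resolution.

```c
#include <time.h>

/* Hypothetical workload standing in for the real code under test. */
unsigned work(unsigned n)
{
    unsigned acc = 0;
    for (unsigned i = 0; i < n; i++)
        acc = acc * 31u + i;
    return acc;
}

/* Returns the CPU seconds spent on `reps` calls of work().
   The results are summed into *sink so the compiler cannot simply
   discard the calls as unused. */
double time_work(unsigned n, unsigned reps, unsigned *sink)
{
    unsigned acc = 0;
    clock_t start = clock();
    for (unsigned r = 0; r < reps; r++)
        acc += work(n + r);
    clock_t end = clock();
    *sink = acc;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Run each build several times with enough repetitions that the total is well above the clock resolution, and compare the distributions rather than single numbers; a difference of a few percent is easily lost in noise.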

4.Do you use FPO? What for (if yes) and does it make a difference to you?

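I wrote a medium-sized C library (20K SLOC) where it made a small difference (~5%) in overall run time under gcc. The library was a native-language extension for a scripting language and had to compile under both gcc and Visual C. Using FPO would have split the build path. I decided that 5% was not worth it for the purpose the extension served. But if this had been a dynamic fluid simulation for predicting the weather, 5% might be worth millions of dollars, and the decision would be different.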

5.Wouldn't an optimized build optimize the excess allocation away?

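That is entirely up to the compiler and optimizer designers. From the MS documentation, it looks like MS has defined the ABI so that all data needs home space, even if it spends its entire lifetime in registers.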
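The arithmetic behind the "at least 32 bytes" figure quoted in the question can be sketched as follows. The names are hypothetical, and the sketch assumes what the quoted article describes: four register parameters, each with an 8-byte home slot, plus one 8-byte slot for every further parameter. It ignores the return address and any alignment padding.

```c
#include <stddef.h>

enum { SLOT_BYTES = 8, REGISTER_PARAMS = 4 };

/* Bytes a caller reserves for the arguments of one call taking `nparams`
   parameters: home space for the four register parameters is always
   present, and each additional parameter adds one more slot. */
size_t outgoing_arg_bytes(size_t nparams)
{
    size_t slots = nparams < (size_t)REGISTER_PARAMS
                       ? (size_t)REGISTER_PARAMS
                       : nparams;
    return slots * SLOT_BYTES;
}
```

Under these assumptions a call with no parameters still reserves 32 bytes, and a call with six parameters reserves 48. Because the reservation is part of the calling convention, an optimizer is not free to drop it for calls that follow the ABI.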

A similar question on this topic can be found on Stack Overflow: https://stackoverflow.com/questions/23528816/
