
c++ - GDB: debugging a core dump whose call stack is missing symbol table information

Reposted. Author: 太空宇宙. Updated: 2023-11-04 09:21:00

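I ran into this strange crash, and I have no idea how to debug the core dump because the call stack is missing symbol information for some reason, except for the last function: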

#0  BIH::intersectRay<VMAP::MapRayCallback> (this=0x7f47b8339608, r=..., intersectCallback=..., maxDist=@0x7f493af8383c: 0, stopAtFirst=true, los=<optimized out>) at ../BIH.h:223
#1 0x000000307ff00000 in ?? ()
#2 0x7ff0000000000000 in ?? ()
#3 0x0000000000000030 in ?? ()
#4 0x000000307ff00000 in ?? ()
#5 0x7ff0000000000000 in ?? ()
#6 0x0000000000000030 in ?? ()
#7 0x000000307ff00000 in ?? ()
#8 0x7ff0000000000000 in ?? ()
#9 0x0000000000000030 in ?? ()
#10 0x000000307ff00000 in ?? ()
#11 0x7ff0000000000000 in ?? ()
#12 0x0000000000000030 in ?? ()
#13 0x000000307ff00000 in ?? ()
#14 0x7ff0000000000000 in ?? ()
#15 0x0000000000000030 in ?? ()
#16 0x000000307ff00000 in ?? ()
#17 0x7ff0000000000000 in ?? ()
#18 0x0000000000000030 in ?? ()
#19 0x000000307ff00000 in ?? ()
#20 0x7ff0000000000000 in ?? ()
#21 0x0000000000000030 in ?? ()
#22 0x000000307ff00000 in ?? ()
....
#749 0x7ff0000000000000 in ?? ()
#750 0x0000000000000030 in ?? ()
#751 0x000000307ff00000 in ?? ()
#752 0x7ff0000000000000 in ?? ()
#753 0x0000000000000030 in ?? ()
#754 0x000000307ff00000 in ?? ()
#755 0x7ff0000000000000 in ?? ()
#756 0x0000000000000030 in ?? ()
#757 0x000000307ff00000 in ?? ()
#758 0x7ff0000000000000 in ?? ()
#759 0x0000000000000030 in ?? ()
#760 0x000000307ff00000 in ?? ()
#761 0x7ff0000000000000 in ?? ()
#762 0x0000000000000030 in ?? ()
#763 0x000000307ff00000 in ?? ()
#764 0x03010102464c457f in ?? ()
#765 0x0000000000000000 in ?? ()


(gdb) info frame 0
Stack frame at 0x7f493af83830:
rip = 0x930f0b in BIH::intersectRay<VMAP::MapRayCallback> (../BIH.h:223); saved rip = 0x307ff00000
called by frame at 0x7f493af83838
source language c++.
Arglist at 0x7f493af83438, args: this=0x7f47b8339608, r=..., intersectCallback=..., maxDist=@0x7f493af8383c: 0, stopAtFirst=true, los=<optimized out>
Locals at 0x7f493af83438, Previous frame's sp is 0x7f493af83830
Saved registers:
rbx at 0x7f493af837f8, rbp at 0x7f493af83800, r12 at 0x7f493af83808, r13 at 0x7f493af83810, r14 at 0x7f493af83818, r15 at 0x7f493af83820, rip at 0x7f493af83828

#1 0x000000307ff00000 in ?? ()
No symbol table info available.
(gdb) info frame 1
Stack frame at 0x7f493af83838:
rip = 0x307ff00000; saved rip = 0x7ff0000000000000
called by frame at 0x7f493af83840, caller of frame at 0x7f493af83830
Arglist at 0x7f493af83828, args:
Locals at 0x7f493af83828, Previous frame's sp is 0x7f493af83838
Saved registers:
rip at 0x7f493af83830

#2 0x7ff0000000000000 in ?? ()
No symbol table info available.
(gdb) info frame 2
Stack frame at 0x7f493af83840:
rip = 0x7ff0000000000000; saved rip = 0x30
called by frame at 0x7f493af83848, caller of frame at 0x7f493af83838
Arglist at 0x7f493af83830, args:
Locals at 0x7f493af83830, Previous frame's sp is 0x7f493af83840
Saved registers:
rip at 0x7f493af83838

#3 0x0000000000000030 in ?? ()
No symbol table info available.
(gdb) info frame 3
Stack frame at 0x7f493af83848:
rip = 0x30; saved rip = 0x307ff00000
called by frame at 0x7f493af83850, caller of frame at 0x7f493af83840
Arglist at 0x7f493af83838, args:
Locals at 0x7f493af83838, Previous frame's sp is 0x7f493af83848
Saved registers:
rip at 0x7f493af83840

#4 0x000000307ff00000 in ?? ()
No symbol table info available.
(gdb) info frame 4
Stack frame at 0x7f493af83850:
rip = 0x307ff00000; saved rip = 0x7ff0000000000000
called by frame at 0x7f493af83858, caller of frame at 0x7f493af83848
Arglist at 0x7f493af83840, args:
Locals at 0x7f493af83840, Previous frame's sp is 0x7f493af83850
Saved registers:
rip at 0x7f493af83848

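The code is compiled with -g -fvar-tracking -O2 -march=native.

I have various dumps from various crashes, and in all of them the symbol table works and gives a relevant call stack and information, but for some reason this particular crash is a mystery.

One thing I noticed is that the same address values repeat over and over. Could this be some kind of infinite loop or recursion that is corrupting or overflowing the stack?
If so, is there any way to get the topmost function in the call stack (for example, some way to get past frame #765, or to find the function that was called before the overflow was triggered)?

I cannot set $sp or jump to any address, because I cannot debug and step through a live program; I can only analyze core dumps.
I cannot reproduce this crash; it happens from time to time in production. valgrind is also out of the question.

Are there any g++ compiler options or gdb flags that could help me with this?
Any pointers on how to debug this kind of problem (if it is possible at all) would be appreciated.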

Best answer

I have no idea how to debug the core dump since the call stack is missing symbols info for some reason

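Part 1:

The most common cause of a meaningless call stack like this is a mismatch between the binary that produced the core dump and the binary you are using to analyze the core.

If you linked with --build-id, or if your GCC is configured to pass that linker flag by default, you can verify whether the binary matches the core using this procedure: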

readelf -n /path/to/binary

This should produce output similar to the following:

$ readelf -n /bin/sleep

Displaying notes found at file offset 0x00000254 with length 0x00000020:
Owner Data size Description
GNU 0x00000010 NT_GNU_ABI_TAG (ABI version tag)
OS: Linux, ABI: 2.6.24

Displaying notes found at file offset 0x00000274 with length 0x00000024:
Owner Data size Description
GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring)
Build ID: c266a51e4b85b16ca17bff8328f3abeafb577b29

The build-id string c266a51e4b85b16ca17bff8328f3abeafb577b29 is the output you care about. Assuming your binary has one, install the elfutils package, then use

eu-unstrip -n --core /path/to/core

to see which binaries were in use when the core dump was generated.

The output should look something like this:

$ eu-unstrip -n --core /tmp/core
0x400000+0x208000 c266a51e4b85b16ca17bff8328f3abeafb577b29@0x400284 - - [exe]
0x7ffca5721000+0x1000 9c7cbcf6c957d8fc8e55b45a3c7a1556b38a3097@0x7ffca5721340 . - linux-vdso.so.1
0x7f491ad5a000+0x2241c8 d0f537904076d73f29e4a37341f8a449e2ef6cd0@0x7f491ad5a1d8 /lib64/ld-linux-x86-64.so.2 /usr/lib/debug/lib/x86_64-linux-gnu/ld-2.19.so ld-linux-x86-64.so.2
0x7f491a995000+0x3c42c0 cf699a15caae64f50311fc4655b86dc39a479789@0x7f491a995280 /lib/x86_64-linux-gnu/libc.so.6 /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.19.so libc.so.6

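Above you can see that this core dump was in fact produced by /bin/sleep: the build-id on the [exe] line is the same one that readelf printed for /bin/sleep.

If the executable build-id in the core does not match your binary, you need to find the binary whose build-id does match the core before you can extract a correct crash stack trace in GDB.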
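If this check has to be repeated for many cores, the comparison could be automated. The following is only a minimal C++ sketch that pulls the build-id out of one line of each tool's output, assuming the output looks like the samples above. The function names are hypothetical and not part of binutils or elfutils.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Extract the build-id from a `readelf -n` line such as
// "Build ID: c266a5...". Returns "" if the line has no build-id.
std::string buildIdFromReadelfLine(const std::string& line) {
    const std::string key = "Build ID:";
    const std::size_t pos = line.find(key);
    if (pos == std::string::npos) return "";
    const std::size_t start = line.find_first_not_of(" \t", pos + key.size());
    if (start == std::string::npos) return "";
    const std::size_t end = line.find_first_of(" \t\r\n", start);
    return line.substr(start, end == std::string::npos ? std::string::npos
                                                       : end - start);
}

// Extract the build-id from an `eu-unstrip -n --core` line, assuming the
// layout shown above: "<start>+<size> <build-id>@<address> ...".
std::string buildIdFromUnstripLine(const std::string& line) {
    std::istringstream in(line);
    std::string range, idField;
    if (!(in >> range >> idField)) return "";
    return idField.substr(0, idField.find('@'));
}

// True only if both lines carry a build-id and the two are identical.
bool sameBuild(const std::string& readelfLine, const std::string& unstripLine) {
    const std::string a = buildIdFromReadelfLine(readelfLine);
    const std::string b = buildIdFromUnstripLine(unstripLine);
    return !a.empty() && a == b;
}
```

Feeding sameBuild the "Build ID:" line from readelf and the [exe] line from eu-unstrip answers the question "is this the binary that dumped this core?" without comparing 40 hex digits by eye.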

Part 2:

If the binary does match the core, then most likely the stack is simply corrupted (for example, by a stack buffer overflow).
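One way to sanity-check the corruption theory is to look at the "return addresses" GDB printed and ask whether they look like code addresses at all. The sketch below is an interpretation added here, not something the original answer states. It classifies a few values from the backtrace in the question, assuming a little-endian x86-64 target with IEEE-754 doubles. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <string>

// Reinterpret a 64-bit word as a double without breaking aliasing rules.
double asDouble(std::uint64_t word) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch expects a 64-bit double");
    double value;
    std::memcpy(&value, &word, sizeof value);
    return value;
}

// True if the low four bytes are 0x7f 'E' 'L' 'F', which is how the start
// of an ELF header reads when loaded as a little-endian integer.
bool startsWithElfMagic(std::uint64_t word) {
    return (word & 0xffffffffu) == 0x464c457fu;
}

// Rough label for a value that GDB reported as a frame's program counter.
std::string classifyFrameValue(std::uint64_t word) {
    if (startsWithElfMagic(word)) return "ELF header bytes";
    if (std::isinf(asDouble(word))) return "double infinity";
    if (word < 0x1000) return "small integer";
    return "unclassified";
}
```

With the values from the question, 0x7ff0000000000000 is the bit pattern of a double-precision +infinity, 0x30 is a small integer, and 0x03010102464c457f (frame #764) begins with the ELF magic bytes. None of them resembles the real code address in frame #0 (rip = 0x930f0b), which fits the reading that GDB is walking over data rather than over saved return addresses.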

valgrind is out of the question.

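Valgrind is exceptionally weak at detecting stack corruption anyway.

The current state of the art for debugging this kind of problem is Address Sanitizer, which is much faster than Valgrind and may be fast enough to run in production.

If the sanitized binary is not fast enough for production, you can set it up to process some subset of the inputs in "shadow mode" (the binary runs, but its output is discarded). Any effort you put into such a setup will probably uncover 10 or more new bugs and save you a great deal of future debugging work.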
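To make the Address Sanitizer suggestion concrete, here is a small sketch of the kind of bug it is designed to catch. The function is hypothetical and deliberately contains a latent defect: it never checks count against the size of its local array. It is not taken from the code in the question.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical function with a latent bug: `count` is trusted and never
// checked against the capacity of the stack array below.
int sumOfSquares(std::size_t count) {
    int squares[8];  // fixed-size buffer on the stack
    for (std::size_t i = 0; i < count; ++i) {
        squares[i] = static_cast<int>(i * i);  // out of bounds once i >= 8
    }
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += squares[i];
    }
    return total;
}
```

Calls with count up to 8 behave normally. In an ordinary -O2 build, a larger count writes past the array and can overwrite saved registers or the return address, which is one way to end up with a backtrace like the one in the question. If the same file is built with something like g++ -g -O1 -fsanitize=address -fno-omit-frame-pointer, the out-of-bounds write should instead stop the program at the faulting line with a stack-buffer-overflow report.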

Regarding c++ - GDB: debugging a core dump whose call stack is missing symbol table information, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41921514/
