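I have been trying to debug a service that crashes with a segmentation fault. I don't have access to the production server, so I handled the SIGSEGV signal in my service and printed a stack trace to the log file. Here is the stack trace from when the service crashed: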
0# 0x00000000005054DA in ./afiniti_lookup
1# 0x00007F2BBB74A400 in /usr/lib64/libc.so.6
2# 0x00007F2BBB86F9BD in /usr/lib64/libc.so.6
3# 0x000000000041BB52 in ./afiniti_lookup
4# std::string::_M_move(char*, char const*, unsigned long) in ./afiniti_lookup
5# std::string::_M_mutate(unsigned long, unsigned long, unsigned long) in ./afiniti_lookup
6# std::string::_M_replace_safe(unsigned long, unsigned long, char const*, unsigned long) in ./afiniti_lookup
7# std::string::assign(char const*, unsigned long) in ./afiniti_lookup
8# std::string::assign(char const*) in ./afiniti_lookup
9# std::string::operator=(char const*) in ./afiniti_lookup
10# 0x000000000061E8E9 in ./afiniti_lookup
11# 0x0000000000620200 in ./afiniti_lookup
12# 0x000000000055B586 in ./afiniti_lookup
13# 0x00000000004F2BAC in ./afiniti_lookup
14# 0x00000000004F0715 in ./afiniti_lookup
15# 0x000000000051CDBF in ./afiniti_lookup
16# 0x0000000000529869 in ./afiniti_lookup
17# 0x0000000000464968 in ./afiniti_lookup
18# 0x0000000000461369 in ./afiniti_lookup
19# 0x0000000000460D6E in ./afiniti_lookup
20# 0x0000000000460086 in ./afiniti_lookup
21# 0x000000000045FD36 in ./afiniti_lookup
22# 0x000000000046CAB4 in ./afiniti_lookup
23# 0x000000000046B4F6 in ./afiniti_lookup
24# 0x000000000046FF13 in ./afiniti_lookup
25# 0x000000000046FE65 in ./afiniti_lookup
26# 0x000000000046FCDA in ./afiniti_lookup
27# 0x00007F2BBCE5038F in /opt/lib64/libcpprest.so.2.10
28# 0x00007F2BBEDCAEA5 in /usr/lib64/libpthread.so.0
29# clone in /usr/lib64/libc.so.6
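However, this trace is not very useful, because I cannot pinpoint where in my code the problem occurs. Can somebody help me better understand and inspect this stacktrace?
Best answer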
Can somebody help me better understand and inspect this stacktrace?
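It looks like you have a partially stripped executable in production.
You should have an unstripped copy (produced when you link your executable). If you don't, you need to change your ways and save a copy before you run strip on it. With the unstripped copy, you can make sense of the stack trace like so: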
addr2line -fe afiniti_lookup.unstripped 0x61E8E9 0x620200 0x55B586 ...
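Copying the addresses out of the log by hand is error-prone, so a small helper can do it. The following is only a sketch, assuming the log keeps the exact format shown in the question; the function name trace_addrs and the file name crash.log are made up for illustration. It keeps only the frames that are still raw addresses inside the given binary and skips frames that are already symbolized or that belong to shared libraries.

```shell
#!/bin/sh
# trace_addrs (hypothetical helper): read a stack trace on stdin and print
# the raw addresses of frames that belong to one binary, one per line.
# Expected frame format, as in the question:
#   10# 0x000000000061E8E9 in ./afiniti_lookup
# Frames that already carry a symbol name (std::string::assign ...) and
# frames from other modules (libc.so.6, libcpprest.so ...) are skipped.
trace_addrs() {
    binary="$1"   # module name exactly as printed in the trace
    awk -v bin="$binary" \
        '$2 ~ /^0x[0-9A-Fa-f]+$/ && $3 == "in" && $4 == bin { print $2 }'
}

# Example use (crash.log is an assumed file holding the trace):
#   addr2line -Cfe afiniti_lookup.unstripped \
#       $(trace_addrs ./afiniti_lookup < crash.log)
```

The -C option asks addr2line to demangle C++ names. The shared-library frames are left out on purpose: their addresses are runtime addresses, so they would first need the library's load base subtracted.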
Here is a worked example:
cat foo.c
int foo() { int *ip = 0; return *ip; }
int bar() { return foo(); }
int zoo() { return bar(); }
int main() { return zoo(); }
Compile it with debug info: gcc -g foo.c (this produces a.out). Strip the binary for "production":
strip --strip-all a.out -o b.out
Run b.out under GDB to simulate the production stack trace:
(gdb) run
Starting program: /tmp/b.out
Program received signal SIGSEGV, Segmentation fault.
0x0000000000401112 in ?? ()
(gdb) bt
#0 0x0000000000401112 in ?? ()
#1 0x0000000000401124 in ?? ()
#2 0x0000000000401134 in ?? ()
#3 0x0000000000401144 in ?? ()
#4 0x00007ffff7dfbcca in __libc_start_main (main=0x401136, argc=1, argv=0x7fffffffdc98, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdc88) at ../csu/libc-start.c:308
#5 0x000000000040104a in ?? ()
Now use addr2line on the unstripped binary to make sense of the stack trace above:
addr2line -fe a.out 0x0000000000401112 0x0000000000401124 0x0000000000401134 0x0000000000401144
foo
/tmp/foo.c:1
bar
/tmp/foo.c:2
zoo
/tmp/foo.c:3
main
/tmp/foo.c:4
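P.S. For actual production use, you would ideally compile your binary with gcc -O2 -g ... so that it has full debug info, and then strip the binary (but keep the full-debug copy). That way you can debug core dumps from production fairly easily, with access to functions, files, lines, and variables.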
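The workflow in this P.S. could be scripted roughly as follows. This is only a sketch: the function name release_strip and the .unstripped suffix are illustrative choices, not something the answer prescribes.

```shell
#!/bin/sh
# release_strip (hypothetical helper): archive the full-debug binary, then
# strip the one that gets deployed.
#   $1 = freshly linked binary, built with something like gcc -O2 -g
release_strip() {
    linked="$1"
    # Keep an untouched copy first; this is the file addr2line and GDB
    # will need later.
    cp "$linked" "$linked.unstripped" || return 1
    # Remove symbols and debug info from the binary that ships.
    strip --strip-all "$linked"
}

# Example use (the binary name is assumed):
#   release_strip ./afiniti_lookup
```

With the archived copy at hand, the trace addresses can be resolved with addr2line as shown earlier, and a core dump from production can be opened with gdb afiniti_lookup.unstripped core (file names assumed).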
Source: the question "c++ - understanding a stack trace" on Stack Overflow: https://stackoverflow.com/questions/64367642/