I'm trying to test a crash scenario using normal (not full) page heap, in a standalone test application.
I have set the flag:
gflags /p /enable Test.exe
I'm overrunning an integer buffer by one element:
...
const size_t s = 100;
vector<int> v1(s, 0);
int* v1_base = &v1[0];
write_to_memory_int(v1_base, s+1);
...
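The definition of write_to_memory_int is not shown in the question. A minimal sketch of what such a helper might look like, assuming it simply stores `count` ints starting at `base` (the `pattern` parameter and its default value are hypothetical, added only for illustration):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the helper used in the question; the real
// definition is not shown there. Stores `count` ints starting at `base`.
// Whether `count` stays inside the buffer is entirely up to the caller.
void write_to_memory_int(int* base, std::size_t count, int pattern = 0x5A5A5A5A)
{
    assert(base != nullptr);
    // volatile keeps the optimizer from dropping or merging the stores
    volatile int* p = base;
    for (std::size_t i = 0; i < count; ++i) {
        p[i] = pattern;
    }
}
```

Called with s+1 on a buffer of s elements, a helper like this writes one int past the end of the allocation, which is undefined behavior. Normal page heap does not trap that write immediately; it checks the fill pattern behind the block when the block is freed, which matches the break in the vector d'tor described next.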
Indeed, I get a break when the block is freed in the vector d'tor. The call stack at the break is reported correctly:
0:005> kp
*** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr
0785faa4 11229df2 verifier!VerifierStopMessage+0x1f8
0785fb08 1122a22a verifier!AVrfpDphReportCorruptedBlock+0x1c2
0785fb64 1122a742 verifier!AVrfpDphCheckNormalHeapBlock+0x11a
0785fb84 112290d3 verifier!AVrfpDphNormalHeapFree+0x22
0785fba8 77951564 verifier!AVrfDebugPageHeapFree+0xe3
0785fbf0 7790ac29 ntdll!RtlDebugFreeHeap+0x2f
0785fce4 778b34a2 ntdll!RtlpFreeHeap+0x5d
0785fd04 750c14dd ntdll!RtlFreeHeap+0x142
0785fd18 71fc4c39 kernel32!HeapFree+0x14
0785fd64 00404b0a msvcr80!free(void * pBlock = 0x0726f7b8)+0xcd [f:\dd\vctools\crt_bld\self_x86\crt\src\free.c @ 110]
0785fd90 00402ac7 Test!std::vector<int,std::allocator<int> >::_Tidy
...
However, when I look at the faulty allocation, I only get the following:
0:005> !heap -p -a 0x0726f7b8
address 0726f7b8 found in
_HEAP @ 30000
HEAP_ENTRY Size Prev Flags UserPtr UserSize - state
0726f790 0039 0000 [00] 0726f7b8 00190 - (busy)
1122a6a7 verifier!AVrfpDphNormalHeapAllocate+0x000000d7
11228f6e verifier!AVrfDebugPageHeapAllocate+0x0000030e
77950d96 ntdll!RtlDebugAllocateHeap+0x00000030
7790af0d ntdll!RtlpAllocateHeap+0x000000c4
778b3cfe ntdll!RtlAllocateHeap+0x0000023a
That is, there is an allocation stack trace, but it stops at RtlAllocateHeap, which is obviously completely useless.
Looking at the stack trace in memory:
dt _DPH_BLOCK_INFORMATION ....-0x20
=>
0:005> dds 0x03e556f4
03e556f4 00000000
03e556f8 00002050
03e556fc 00050000
03e55700 1122a6a7 verifier!AVrfpDphNormalHeapAllocate+0xd7
03e55704 11228f6e verifier!AVrfDebugPageHeapAllocate+0x30e
03e55708 77950d96 ntdll!RtlDebugAllocateHeap+0x30
03e5570c 7790af0d ntdll!RtlpAllocateHeap+0xc4
03e55710 778b3cfe ntdll!RtlAllocateHeap+0x23a
03e55714 00000000
03e55718 00003001
03e5571c 0004005e
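It looks like nothing more was actually recorded.
How can I fix page heap so that it records a useful stack trace?
Note that the test project is not compiled with FPO (/Oy), and I wouldn't expect RtlAllocateHeap to be affected by FPO anyway?
Update: I checked the FPO-ness of the relevant calls by manually breaking into the allocation (see below), and it looks like malloc as well as op new of the VC80 (VS2005) runtime libraries have some form of FPO enabled... so this may be what messes up the stack trace in the page heap stack database.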
0:004> kv
ChildEBP RetAddr Args to Child
077efa7c 77c8af0d 05290000 01001002 00000190 ntdll!RtlDebugAllocateHeap+0x16 (FPO: [Non-Fpo])
077efb60 77c33cfe 00000190 00000000 00000000 ntdll!RtlpAllocateHeap+0xc4 (FPO: [Non-Fpo])
077efbe4 72344d83 05290000 01001002 00000190 ntdll!RtlAllocateHeap+0x23a (FPO: [Non-Fpo])
077efc04 62f595ee 00000190 00000000 00000000 MSVCR80!malloc+0x7a (FPO: [1,0,0]) (CONV: cdecl)
077efc1c 00406a44 00000190 ebecf74f 00000001 MFC80U!operator new+0x2f (FPO: [Uses EBP] [1,0,0]) (CONV: cdecl)
077efc48 00405479 00000064 00000000 3fffffff Test!std::_Allocate<ATL::CStringT<wchar_t,StrTraitMFC_DLL<wchar_t,ATL::ChTraitsCRT<wchar_t> > > >+0x84 (FPO: [Non-Fpo]) (CONV: cdecl)
077efcb8 004049f4 00000064 ebecf68f 00000000 Test!std::vector<unsigned int,std::allocator<unsigned int> >::_Buy+0x69 (FPO: [Non-Fpo]) (CONV: thiscall)
077efd88 00402a4f 00000064 077efdc0 ebecf44b Test!std::vector<int,std::allocator<int> >::_Construct_n+0x44 (FPO: [Non-Fpo]) (CONV: thiscall)
077eff4c 72342848 00000000 ebec8474 00000000 Test!crashFN+0x35f (FPO: [Non-Fpo]) (CONV: cdecl)
077eff84 723428c8 75da33aa 072ab3d8 077effd4 MSVCR80!_callthreadstart+0x1b (FPO: [Non-Fpo]) (CONV: cdecl)
077eff88 75da33aa 072ab3d8 077effd4 77c39f72 MSVCR80!_threadstart+0x5a (FPO: [1,0,0]) (CONV: stdcall)
077eff94 77c39f72 072ab3d8 70fca8b2 00000000 kernel32!BaseThreadInitThunk+0xe (FPO: [Non-Fpo])
077effd4 77c39f45 7234286e 072ab3d8 00000000 ntdll!__RtlUserThreadStart+0x70 (FPO: [Non-Fpo])
077effec 00000000 7234286e 072ab3d8 00000000 ntdll!_RtlUserThreadStart+0x1b (FPO: [Non-Fpo])
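Best answer
Thanks to @Marc Sherman for pointing out in the comments that I should check the real allocation stack trace.
As already edited into the question, VC80 (VS2005) is the problem here, because its CRT was built with FPO enabled, as the stack trace shows: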
MSVCR80!malloc+0x7a (FPO: [1,0,0]) (CONV: cdecl)
MFC80U!operator new+0x2f (FPO: [Uses EBP] [1,0,0]) (CONV: cdecl)
Now, with an anchor to search for, we find the following:
Why does every heap trace in UMDH get stuck at “malloc”?
Some quotes from it:
In particular, it would appear that the default malloc implementation on the static link CRT on Visual C++ 2005 not only doesn’t use a frame pointer, but it trashes ebp as a scratch register ...
What does this all mean? Well, anything using malloc that’s built with Visual C++ 2005 won’t be diagnosable with UMDH or anything else that relies on ebp-based stack traces, at least not on x86 builds.
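To make the quoted point concrete, here is a small portable simulation of an ebp-style frame walk. It is only a sketch under stated assumptions: the stack is modeled as a vector of frames, indices stand in for addresses, and the names SimFrame, walk_frame_chain, healthy_stack, fpo_stack, kEnd and kTrashed are all hypothetical. It does not reproduce the validation the real page heap or UMDH walker performs (the trace in the question stops one entry earlier, at RtlAllocateHeap); it only shows why a single function that leaves garbage in ebp cuts off every frame above it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One simulated stack frame as a frame-pointer walker sees it.
struct SimFrame {
    std::size_t saved_fp;  // what "push ebp" stored: index of the caller's frame
    std::string ret_into;  // label standing in for the return address
};

// Follow saved frame pointers outward from `fp`, collecting return labels.
// Stops as soon as the saved value is not a plausible caller frame.
std::vector<std::string> walk_frame_chain(const std::vector<SimFrame>& stack,
                                          std::size_t fp,
                                          std::size_t max_depth = 32)
{
    assert(max_depth > 0);
    std::vector<std::string> trace;
    while (fp < stack.size() && trace.size() < max_depth) {
        trace.push_back(stack[fp].ret_into);
        const std::size_t next = stack[fp].saved_fp;
        if (next <= fp) break;  // caller frames must lie further out
        fp = next;
    }
    return trace;
}

const std::size_t kEnd = 0;          // chain terminator (never above current index)
const std::size_t kTrashed = 0x190;  // arbitrary scratch value left in ebp

// Every function sets up a frame, so the chain is intact.
std::vector<SimFrame> healthy_stack()
{
    return {
        {1, "ntdll!RtlAllocateHeap"},
        {2, "MSVCR80!malloc"},
        {3, "MFC80U!operator new"},
        {kEnd, "Test!std::_Allocate"},
    };
}

// malloc used ebp as a scratch register, so the frame of the function it
// called saved garbage instead of a link to the caller's frame.
std::vector<SimFrame> fpo_stack()
{
    std::vector<SimFrame> stack = healthy_stack();
    stack[1].saved_fp = kTrashed;
    return stack;
}
```

Walking healthy_stack() from index 0 reaches all four labels, while walking fpo_stack() ends at the malloc entry: the outer frames are still present in the simulated stack, but nothing links to them any more.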
There is also a reply in the comments with good information:
Mark Roberts [MSFT] says: February 25, 2008 at 3:03 pm
Hello,
Enabling FPO for the 8.0 CRT was not deliberate. The Visual Studio 2008 CRT (9.0) does NOT have FPO enabled, and UMDH should function normally.
For 8.0, an alternative to UMDH would be to use LeakDiag. LeakDiag will actually instrument memory allocators to obtain stack traces. This makes it more versatile than UMDH as it can hook several different allocator types at different granularities (Ranging from the c runtime to raw virtual memory allocations).
By default, LeakDiag simply walks the stack base pointers, but it can be modified to use the Dbghlp StackWalkAPI to resolve FPO data. This will produce full stacks, though the performance penalty is higher. On the flip side, you can customize the stack walking behavior to only go to a certain depth, etc to minimize the perf penalty.
Please find LeakDiag here: ftp://ftp.microsoft.com/PSS/Tools/Developer%20Support%20Tools/LeakDiag/leakdiag125.msi
Original question on Stack Overflow: https://stackoverflow.com/questions/19028249/