Walking the call stack as fast as possible
Iulian Radu

Posted: Visual C++ General, Walking the call stack as fast as possible
Hello!
I have spent some time developing a memory allocator that replaces the default CRT allocator. It is almost finished, but I recently added a leak trace that saves the call stack for every memory allocation. (The stack is saved in the process's own memory, so there is no interprocess communication overhead.) This adds a very large overhead to the allocator, and I was wondering whether there is an efficient way of getting the call stack.
At the moment I'm using StackWalk64, and the profiler I'm using (CodeAnalyst) reports that 99+% of the time is spent in dbghelp.dll, ntoskrnl.dll and msvcrxx.dll (strangely, the functions used from the latter are wcslen and its family, even though I don't use them in my code). Even stranger, only 5-6% of the time spent in dbghelp.dll is used by StackWalk64; the bulk of it is used by MiniDumpWriteDump.
That being said, is there any faster way of getting the call stack? I don't expect the code to be as fast as without the trace, but I think a 20-50x slowdown would be better than this. Right now the speed of the allocator has decreased by 3 orders of magnitude. (By the way, I'm not getting a full stack, just the first 10 frames.)
Visual C++7
Ayman Shoukry

Iulian Radu

Ayman Shoukry wrote:
I found out about the DIA SDK just a couple of hours ago. Is it faster?
Ayman Shoukry

To be honest, I don't have much experience with DIA myself (it is probably worth a try), but I will forward the issue to one of the folks on our team who should know more than I do :-)
Thanks,
Ayman Shoukry
VC++ Team
Milis

It depends, actually. If the function frames are stored neatly on the stack and are easy to find, then yes, it is pretty fast. But we all know that is not the case, especially when the application is an optimized one. Still, it should give you better performance than using StackWalk64.
Unfortunately, there's no good example in the product yet of how to do stack walking with the DIA SDK. I was hoping to get one in before shipping, but I was too late. The documentation should be enough, though, given that you already know a lot from working with the dbghelp APIs, which work very similarly to the DIA SDK stack walker.
Thanks,
Milis
VC++ team
Iulian Radu

Thank you very much. I'll give it a try.
KasperWessing

Hi Iulian,
Did you manage to create a working stack walk with DIA? We would like to do the same thing as you, but I find it a bit unclear how to implement IDiaStackWalkHelper and how to finally create the IDiaStackWalkHelper object. If you or anyone else (e.g. the VC++ team at Microsoft) has already done this, I would like to use it as sample code.
Thx in advance, Kasper
Iulian Radu

KasperWessing wrote:
Hi Iulian,
Did you manage to create a working stack walk with DIA? We would like to do the same thing as you, but I find it a bit unclear how to implement IDiaStackWalkHelper and how to finally create the IDiaStackWalkHelper object. If you or anyone else (e.g. the VC++ team at Microsoft) has already done this, I would like to use it as sample code.
Thx in advance, Kasper
Hello!
I haven't tried using it, since I didn't expect a magical increase in speed. Instead, I've fiddled with the /Gh and /GH switches and added _penter and _pexit functions that keep track of the stack. Actually, I've only been using /Gh, since VC++ 2005 had a bug that kept _pexit from being placed properly in all cases. They fixed that in the patch, and I recommend that you use the /GH switch too, since the trick I used to emulate the call to my _pexit handler made the stack untraversable during debugging.
The _penter/_pexit solution I've implemented only slows the program by 5-10%, I think (versus 1-2000% with WinDBG). Of course, different calling behaviors have different performance penalties, but it's much better than the all-purpose solution provided by DIA or WinDBG. The main limitation is that it requires recompilation, so you can only trace your own executables and it won't work with a plugin system.
I hope you can use _penter/_pexit to solve your problem.
Sincerely, Iulian Radu
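For readers who want to try the approach described above, here is a minimal sketch of the bookkeeping that _penter/_pexit hooks could feed. It is not Iulian's implementation. The namespace shadow and the names on_enter, on_exit, capture, kMaxDepth and kMaxSaved are all hypothetical. The MSVC-specific hook glue (the _penter and _pexit functions called by code built with /Gh and /GH, which have to preserve registers and work out the address of the instrumented function) is deliberately left out, so the sketch is plain portable C++.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch: a per-thread "shadow" call stack maintained by
// function entry/exit hooks. Only /Gh, /GH, _penter and _pexit come from
// the thread; every name below is an assumption made for illustration.
namespace shadow {

constexpr std::size_t kMaxDepth = 1024;  // assumed cap on recorded depth
constexpr std::size_t kMaxSaved = 10;    // the thread saves the first 10 frames

struct Stack {
    const void* frames[kMaxDepth];  // slots below `depth` are valid
    std::size_t depth = 0;          // logical depth, may exceed kMaxDepth
};

// One shadow stack per thread, so the hooks need no locking.
inline Stack& current() {
    thread_local Stack s;
    return s;
}

// To be called from the entry hook with the address of the entered function.
inline void on_enter(const void* fn) {
    Stack& s = current();
    if (s.depth < kMaxDepth) s.frames[s.depth] = fn;  // deeper calls are counted but not recorded
    ++s.depth;
}

// To be called from the exit hook.
inline void on_exit() {
    Stack& s = current();
    assert(s.depth > 0 && "on_exit without a matching on_enter");
    --s.depth;
}

// Copies up to max_out of the most recently recorded frames into `out`,
// newest first, and returns how many were copied. No stack unwinding and
// no symbol lookup happens here, which is where the speed comes from.
inline std::size_t capture(const void** out, std::size_t max_out) {
    const Stack& s = current();
    const std::size_t top = s.depth < kMaxDepth ? s.depth : kMaxDepth;
    const std::size_t n = top < max_out ? top : max_out;
    for (std::size_t i = 0; i < n; ++i) out[i] = s.frames[top - 1 - i];
    return n;
}

}  // namespace shadow
```

An allocator would call shadow::capture(buf, shadow::kMaxSaved) inside its allocation routine and store the result next to the block header. The cost per allocation is a short copy loop, while the cost per function call is one store and one increment, which fits the modest slowdown reported above.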
KasperWessing

Hi Iulian,
It works like a charm.
Thx, Kasper
dosler

Hi there, Iulian!
You could benefit from the undocumented (at least for WinXP) ntdll!RtlCaptureStackBackTrace.
There's some info on the Windows Server 2003 implementation in MSDN, but I found it equally applicable to WinXP. It is a low-overhead stack capture routine that works with restrictions: it does not walk FPO-optimized frames (by design) and it does not use symbols (also by design; that's where you get the speed benefit). Once the stack traces have been taken, they can be resolved to symbols at any convenient point later on (using dbghelp or DIA). This is how the debugging page heap functionality is implemented in the system itself (these snapshots must be fast!).
Hope it is still helpful!
dmitri.
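A minimal sketch of the capture-now, symbolize-later idea from the post above, under some assumptions: it calls the routine through CaptureStackBackTrace, the name that windows.h exposes for RtlCaptureStackBackTrace, and it compiles to a stub that captures nothing on non-Windows platforms. RawTrace, AllocRecord, capture_raw_trace, make_record and kMaxFrames are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>

#if defined(_WIN32)
#include <windows.h>  // declares CaptureStackBackTrace
#endif

// Hypothetical sketch: store raw return addresses at allocation time and
// leave symbol resolution (dbghelp or DIA) for later, e.g. at leak-report time.
constexpr unsigned kMaxFrames = 10;  // the thread only needs the first 10 frames

struct RawTrace {
    void* frames[kMaxFrames] = {};  // raw return addresses, innermost first
    unsigned short count = 0;       // number of valid entries in `frames`
};

// Captures up to kMaxFrames return addresses without touching symbols.
// As noted in the post, frames of FPO-optimized code may be missed.
inline RawTrace capture_raw_trace() {
    RawTrace t;
#if defined(_WIN32)
    // Skip one frame so the trace starts at our caller. This assumes the
    // function is not inlined; otherwise one real frame is lost.
    t.count = CaptureStackBackTrace(1, kMaxFrames, t.frames, nullptr);
#endif
    // On other platforms this sketch deliberately captures nothing.
    return t;
}

// What a leak-tracking allocator might keep per live block (an assumption).
struct AllocRecord {
    const void* block = nullptr;
    std::size_t size = 0;
    RawTrace trace;
};

inline AllocRecord make_record(const void* block, std::size_t size) {
    assert(block != nullptr);
    AllocRecord r;
    r.block = block;
    r.size = size;
    r.trace = capture_raw_trace();
    return r;
}
```

The record holds only addresses, so nothing on the allocation path needs symbol information; the addresses can be mapped to function names once, when a leak report is actually produced.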
YCY

Has anyone ever used DIA to walk the stack of a module built with FPO (frame pointer omission)? Does it work?