[C++] Replacing the application's call stack

Started by
2 comments, last by Juliean 2 years ago

Hello,

Another question came up regarding my JIT implementation, specific to Win64/MSVC. This one is more of a cleanup/improvement, but still…

So for context, my visual scripting supports coroutine-style “yield”ing, but with a lot more flexibility and fewer restrictions. Notably, nested calls to yielding methods work transitively without any overhead, either in speed or in how the “code” has to be written. In the interpreter implementation, this is done by having separate (smaller) stacks for potentially-yielding entry frames, which are allocated from a pool and used instead of the primary stack. This allows both yield and resume to be O(1) after the initial start, regardless of how many layers deep the call stack is at the point of the “yield”.
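To make the pooled-stack idea concrete, here is a minimal sketch of how it could look. All names (`ScriptStack`, `StackPool`, `Coroutine`) and the 16 KiB block size are made up for illustration and are not taken from the actual implementation. The point is only that a suspended call chain keeps ownership of its whole stack block, so yield and resume are pointer moves rather than copies.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical fixed-size stack block used by one yielding call chain.
struct ScriptStack {
    static constexpr std::size_t kSize = 16 * 1024; // assumed size, for illustration only
    std::unique_ptr<unsigned char[]> memory{new unsigned char[kSize]};
    std::size_t top = 0; // number of bytes currently in use
};

// Hypothetical pool: hands out stacks from a free list.
class StackPool {
public:
    // O(1) when a free stack exists; allocates only when the pool is empty.
    std::unique_ptr<ScriptStack> acquire() {
        if (free_.empty())
            return std::make_unique<ScriptStack>();
        std::unique_ptr<ScriptStack> stack = std::move(free_.back());
        free_.pop_back();
        stack->top = 0; // recycled stacks start out empty
        return stack;
    }

    // Return a finished stack to the free list for reuse.
    void release(std::unique_ptr<ScriptStack> stack) {
        free_.push_back(std::move(stack));
    }

    std::size_t freeCount() const { return free_.size(); }

private:
    std::vector<std::unique_ptr<ScriptStack>> free_;
};

// A suspended coroutine just owns its stack. Yielding moves the pointer in,
// resuming moves it back out; nothing is copied, whatever the call depth.
struct Coroutine {
    std::unique_ptr<ScriptStack> stack;
};
```

With this shape, the cost of a yield does not depend on how many frames are stored in the block, which is the O(1) property described above.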

Now with JIT, things become more complicated. Regular methods are handled like a normal “call” opcode with a “ret” at the end. I'd like to use the same logic for yielding methods, but the main problem here is preserving the stack data across a “yield” and the subsequent “resume”. In short, to achieve the same characteristics, I'd need to preemptively replace the application's stack with a custom one, which is then retained upon yield and later reinstated.

Now this is the primary question: How do I replace the stack of the application with a new one?

I know that RSP is the designated stack pointer, and I know that I can potentially replace the value of this register to point to another memory location to serve as a different stack. There is just one problem when trying this on Windows:

Replacing RSP with my own stack screws with the program's unwind information, meaning no stack trace is present and exceptions are not handled properly while my own stack is in use. This is even though I designate a register (R14) as the frame pointer and register it in the unwind information via UWOP_SET_FPREG. The frame-pointer mechanism itself seems to work, since I can add/sub to RSP as much as I want and the unwind still works. So, is there anything else I have to do to make Windows/MSVC able to unwind properly when a custom stack is used? I've not been able to find a single source of information regarding this, so this is probably something that isn't done very often (regular coroutines are apparently implemented entirely differently, but that type of implementation is not compatible with my yield model). Still, has anybody got any experience with this type of thing?
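For readers unfamiliar with the unwind data mentioned here, the following is a rough sketch of the bytes an `UNWIND_INFO` header with a `UWOP_SET_FPREG` entry would contain, following the documented x64 unwind format. The prolog it assumes (`push r14` followed by `mov r14, rsp`), the helper names, and the constants' spellings are illustrative assumptions, not the actual JIT output from this thread, and the bytes would still have to be registered with the OS for JIT-generated code.

```cpp
#include <cstdint>
#include <vector>

// Values from the documented x64 unwind format; the constant names are illustrative.
const std::uint8_t kUwopPushNonvol = 0;  // UWOP_PUSH_NONVOL
const std::uint8_t kUwopSetFpreg   = 3;  // UWOP_SET_FPREG
const std::uint8_t kRegR14         = 14; // register number of R14

// Second byte of an UNWIND_CODE slot: operation in the low 4 bits, info in the high 4.
inline std::uint8_t packOp(std::uint8_t op, std::uint8_t info) {
    return static_cast<std::uint8_t>((op & 0x0F) | (info << 4));
}

// Builds UNWIND_INFO bytes for an assumed prolog:
//   push r14        ; 2 bytes, ends at prolog offset 2
//   mov  r14, rsp   ; 3 bytes, ends at prolog offset 5
inline std::vector<std::uint8_t> buildFramePointerUnwindInfo() {
    const std::uint8_t version      = 1;
    const std::uint8_t flags        = 0; // no handler, no chained info
    const std::uint8_t sizeOfProlog = 5;
    const std::uint8_t countOfCodes = 2;
    const std::uint8_t frameOffset  = 0; // scaled by 16; R14 == RSP at the time it is set

    return {
        static_cast<std::uint8_t>(version | (flags << 3)),
        sizeOfProlog,
        countOfCodes,
        static_cast<std::uint8_t>(kRegR14 | (frameOffset << 4)),
        // Unwind codes are listed in descending prolog-offset order.
        5, packOp(kUwopSetFpreg, 0),         // info bits are reserved for this op
        2, packOp(kUwopPushNonvol, kRegR14), // info = the register that was pushed
    };
}
```

This only describes how a frame is laid out relative to R14. It says nothing about whether the unwinder also expects RSP to lie within the thread's known stack range, which is the open question in this post.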

Juliean said:
How do I replace the stack of the application with a new one?

It's been a LONG time so I can't really give useful info, but I know some caveats.

I did that kind of thing in simpler times about 20 years ago involving longjmps in C and assembly, and it didn't involve cleaning up nontrivial data. Here you've got destructors, exceptions, and more complex stack unwinding.

For documents, I know at one point years ago I looked at the enormous PDFs describing the Windows x64 ABI, so I know the docs exist, but sadly I don't know where to find them today. They're likely somewhere on MSDN or Intel's sites.

Even back then there were lots of details about how the stack is used, and compiler-specific details in how space is used as the code is compiled, linked, and optimized, as assorted variables and data live there, exception information encoded there, plus you've got the stack's shadow space that needs to be managed. The build tools (compiler & linker) often assume they're the only systems modifying the details so they can have odd customizations. From disassembly I know that compiler/linker flags enabling various debugging functionality will also change how the stack and shadow space are used. So it isn't enough to do it one way, since debug builds, release builds, and other configurations can all be different.

Good luck hunting it down, it's an area few but compiler writers and disassembly readers will ever tread.

frob said:
I did that kind of thing in simpler times about 20 years ago involving longjmps in C and assembly, and didn't involve cleaning up nontrivial data. Here you've got destructors, exceptions, and more complex stack unwinding.

Actually, with the exception of exceptions and unwinding, I don't think the actual handling of the stack should matter that much for my case. The replacement of the stack would happen at a well-defined place in the prologue of one of my entry functions, so anything before this function would lie on the old stack, and anything beyond would be placed on the new stack. This should mean that things like destructors of stack-local variables should work without issues, because the variable is on the new stack, and the destructor would also be executed while RSP is still pointing to that new stack, so they should both naturally agree on the location of the variable in memory.

In fact, I believe that the whole layout of the stack shouldn't matter; as I said, I already made a test case which ran fine except for unwinding via the frame pointer. MSDN has a page on stack usage, but it only covers their API/ABI conventions and not the actual stack itself (https://docs.microsoft.com/en-us/cpp/build/stack-usage?view=msvc-170). From there, the only requirement I see is that RSP is 16-byte aligned. I also read from another source that the normal stack is handled by allocating entire pages at once, with a guard page following to trigger an access violation when resizing is required. Maybe I should try allocating the stack in this way (I just used my normal new[]-based stack for my test).
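As a small illustration of the alignment requirement, here is a sketch of computing a 16-byte-aligned initial stack pointer for a new[]-based stack like the one from my test. The names `CustomStack`, `makeStack` and `initialStackPointer` are hypothetical. Page-based allocation with a guard page is platform-specific and deliberately not shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical heap-backed stack, similar in spirit to a new[]-based buffer.
struct CustomStack {
    std::unique_ptr<unsigned char[]> memory;
    std::size_t size;
};

inline CustomStack makeStack(std::size_t size) {
    return CustomStack{std::unique_ptr<unsigned char[]>(new unsigned char[size]), size};
}

// The stack grows downwards, so the initial stack pointer is the end of the
// buffer, rounded down to a 16-byte boundary to satisfy the alignment rule.
inline std::uintptr_t initialStackPointer(const CustomStack& stack) {
    const std::uintptr_t end =
        reinterpret_cast<std::uintptr_t>(stack.memory.get()) + stack.size;
    return end & ~static_cast<std::uintptr_t>(15);
}
```

The value returned here is what RSP would be set to before the first call on the new stack. Each call then pushes an 8-byte return address, and callers are expected to reserve the 32 bytes of shadow space described on the page linked above.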

But yeah, it's not the end of the world if I don't get this to work. I've got a working implementation using my own stack class for storing the return information of yielding methods (which still has to exist anyway). The advantage of replacing RSP would mainly be speed, as well as having a simpler way of creating a stack trace from inside the yielding call frame. But I reckon this is something rather obscure. I'll try to figure out if there's even a way to do that, but I might not find a solution (maybe x64 MSVC/Windows just has some internal requirement that RSP points to the application-allocated stack).
