On 07/29, Andy Lutomirski wrote:
>
> > SAVE_REST is 6 movq instructions and a subq. FIXUP_TOP_OF_STACK is 7
> > movqs (and 8 if I ever get my way). RESTORE_TOP_OF_STACK is 4.
> > RESTORE_REST is 6 movqs and an addq. So we're talking about avoiding
> > 21 movqs, an addq, and a subq. That may be significant. (And I
> > suspect that the difference is much larger on platforms like arm64,
> > but that's a separate issue.)

OK, thanks. We could probably simplify the logic in phase1 + phase2 if it
was a single function, though.

> To put some more options on the table: there's an argument to be made
> that the whole fast-path/slow-path split isn't worth it. We could
> unconditionally set up a full frame for all syscalls. This means:

Or, at least, can't we allocate the full frame and avoid "add/sub %rsp"?

> This means:
...
> On the
> other hand, there's zero chance that this would be ready for 3.17.
>
> I'd tend to advocate for keeping the approach in my patches for now.

Yes, sure, I didn't try to convince you to change this code.

Thanks.

Oleg.