On August 11, 2017 9:57:13 AM PDT, Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
>On Fri, Aug 11, 2017 at 09:22:11AM -0700, Andy Lutomirski wrote:
>> On Fri, Aug 11, 2017 at 5:13 AM, tip-bot for Josh Poimboeuf
>> <tipbot@xxxxxxxxx> wrote:
>> > Commit-ID:  bf4d1a83758368c842c94cab9661a75ca98bc848
>> > Gitweb:     http://git.kernel.org/tip/bf4d1a83758368c842c94cab9661a75ca98bc848
>> > Author:     Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
>> > AuthorDate: Thu, 10 Aug 2017 16:37:26 -0500
>> > Committer:  Ingo Molnar <mingo@xxxxxxxxxx>
>> > CommitDate: Fri, 11 Aug 2017 14:06:15 +0200
>> >
>> > objtool: Track DRAP separately from callee-saved registers
>> >
>> > When GCC realigns a function's stack, it sometimes uses %r13 as the DRAP
>> > register, like:
>> >
>> >   push   %r13
>> >   lea    0x10(%rsp), %r13
>> >   and    $0xfffffffffffffff0, %rsp
>> >   pushq  -0x8(%r13)
>> >   push   %rbp
>> >   mov    %rsp, %rbp
>> >   push   %r13
>> >   ...
>> >   mov    -0x8(%rbp),%r13
>> >   leaveq
>> >   lea    -0x10(%r13), %rsp
>> >   pop    %r13
>> >   retq
>> >
>>
>> I have a couple questions, mainly to help me understand.
>>
>> Question 1: What does DRAP stand for? Duplicate Return Address
>> Pointer? Dynamic ReAlignment Pointer? I tried searching and got
>> nothing.
>
>It seems to be a GCC invention which stands for:
>
>  Dynamic Realign Argument Pointer.
>
>I don't think it's documented anywhere, but there's at least some
>comments about it in the GCC sources if you search for DRAP.
>
>> Question 2: What's up with the resulting stack layout? It seems we have:
>>
>>   caller's last stack slot  <-- r13 in function body points here
>>   return address
>>   old r13
>>   [possible padding for alignment]
>>   return address, duplicated (for naive unwinder's benefit?)
>>   old rbp  <-- rbp in body points here
>>   new r13, i.e. pointer to caller's last stack slot
>>
>> Now we have the function body, and r13 is free for use in here because
>> it's saved.
>>
>> In the epilogue, we recover r13, use leaveq (hmm, shorter than pop
>> %rbp but does more work than needed), restore the old r13, and return.
>>
>> I don't get it, though. gcc only ever uses that inner r13 with an
>> offset. The code would be considerably shorter if the second
>> instruction were just mov %rsp, %r13. That would change the push to
>> pushq 0x8(%rsp) and the third-to-last instruction to mov %r13, %rsp,
>> saving something like 8 bytes of code.
>
>I don't know why it doesn't do it the way you suggest, but I'm glad it
>doesn't because I think it would make the DWARF/ORC data even more
>complicated. Here it's "simple", because r13 == DWARF CFA.
>
>> I also don't get why any of this is needed. Couldn't the compiler
>> just do push %rbp; mov %rsp, %rbp; and $0xfffffffffffffff0, %rsp and
>> be done with it?
>
>Good question. I wish it did just use the frame pointer, because
>dealing with DRAP has been a headache.
>
>> I compiled this:
>>
>> void func()
>> {
>>         int var __attribute__((aligned(32)));
>>         asm volatile ("" :: "m" (var));
>> }
>>
>> and got:
>>
>> func:
>>         leaq    8(%rsp), %r10
>>         andq    $-32, %rsp
>>         pushq   -8(%r10)
>>         pushq   %rbp
>>         movq    %rsp, %rbp
>>         pushq   %r10
>>         popq    %r10
>>         popq    %rbp
>>         leaq    -8(%r10), %rsp
>>         ret
>>
>> Which is better than the crud you pasted, since it at least uses a
>> caller-saved reg (r10), but we still have the nasty addressing modes
>> *and* an unnecessary push and pop of r10.
>>
>> I filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81825 and maybe
>> some GCC person has a clue what's going on.
>
>I've found that, when it does this DRAP pattern, most of the time it
>uses r10. The r13 version seems to be more rare. I can provide a
>real-world r13 example if that would help.

One could logically assume %r10 if a clobbered register is sufficient.
It would make sense to do that renaming fairly late in the game.
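
For anyone trying to follow the stack layout discussed above, here is a rough C sketch that replays the quoted %r13 prologue and epilogue on a plain integer standing in for %rsp. The names drap_frame and drap_model are made up for illustration and do not come from objtool or GCC. It models only the address arithmetic, not real stack contents.

```c
#include <stdint.h>

/* Hypothetical model of the quoted %r13 DRAP sequence.
 * All fields are addresses; nothing here touches the real stack. */
struct drap_frame {
    uint64_t drap;        /* %r13 in the body: entry %rsp + 8, i.e. the CFA */
    uint64_t saved_r13;   /* slot holding the caller's %r13 */
    uint64_t dup_ret;     /* slot holding the duplicated return address */
    uint64_t rbp;         /* %rbp in the body: slot holding the old %rbp */
    uint64_t saved_drap;  /* slot holding the DRAP value, at -0x8(%rbp) */
    uint64_t rsp_at_ret;  /* %rsp just before retq */
};

static struct drap_frame drap_model(uint64_t entry_rsp)
{
    struct drap_frame f;
    uint64_t rsp = entry_rsp;   /* on entry, %rsp points at the return address */

    /* Prologue */
    rsp -= 8;                   /* push %r13 */
    f.saved_r13 = rsp;
    f.drap = rsp + 0x10;        /* lea 0x10(%rsp), %r13 */
    rsp &= ~(uint64_t)0xf;      /* and $0xfffffffffffffff0, %rsp */
    rsp -= 8;                   /* pushq -0x8(%r13): copy of the return address */
    f.dup_ret = rsp;
    rsp -= 8;                   /* push %rbp */
    f.rbp = rsp;                /* mov %rsp, %rbp */
    rsp -= 8;                   /* push %r13: DRAP value saved for the epilogue */
    f.saved_drap = rsp;

    /* Epilogue (after mov -0x8(%rbp), %r13 reloads the DRAP value) */
    rsp = f.rbp + 8;            /* leaveq: mov %rbp, %rsp; pop %rbp */
    rsp = f.drap - 0x10;        /* lea -0x10(%r13), %rsp: discards the aligned area */
    rsp += 8;                   /* pop %r13 */
    f.rsp_at_ret = rsp;
    return f;
}
```

With an entry %rsp of 0x1008, the model puts the DRAP at 0x1010 and %rbp at 0xff0, and %rsp is back at 0x1008 before retq, whatever padding the alignment step added. That is consistent with the r13 == CFA remark above.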