On 15.06.23 17:18, Sean Christopherson wrote: > On Wed, Jun 14, 2023, Mathias Krause wrote: >> On 13.06.23 02:32, Sean Christopherson wrote: >>> On Thu, Apr 13, 2023, Mathias Krause wrote: >>>> Use an array type to refer to the code label 'ret_to_kernel'. >>> >>> Why? Taking the address of a label when setting what is effectively the target >>> of a branch seems more intuitive than pointing at an array (that's not an array). >> >> Well, the flexible array notation is what my understanding of referring >> to a "label" defined in ASM is. I'm probably biased, as that's a pattern >> used a lot in the Linux kernel but trying to look at the individual >> semantics may make it clearer why I prefer 'extern char sym[]' over >> 'extern char sym'. >> >> The current code refers to the code sequence starting at 'ret_to_kernel' >> by creating an alias of it's first instruction byte. However, we're not >> interested in the first instruction byte at all. We want a symbolic >> handle of the begin of that sequence, which might be an unknown number >> of bytes. But that's also a detail that doesn't matter. We only what to >> know the start. By referring to it as 'extern char' implies that there >> is at least a single byte. (Let's ignore the hair splitting about just >> taking the address vs. actually dereferencing it (which we do not).) By >> looking at another code example, that byte may not actually be there! >> >From x86/vmx_tests.c: >> >> extern char vmx_mtf_pdpte_guest_begin; >> extern char vmx_mtf_pdpte_guest_end; >> >> asm("vmx_mtf_pdpte_guest_begin:\n\t" >> "mov %cr0, %rax\n\t" /* save CR0 with PG=1 */ >> "vmcall\n\t" /* on return from this CR0.PG=0 */ >> "mov %rax, %cr0\n\t" /* restore CR0.PG=1 to enter PAE mode */ >> "vmcall\n\t" >> "retq\n\t" >> "vmx_mtf_pdpte_guest_end:"); >> >> The byte referred to via &vmx_mtf_pdpte_guest_end may not even be mapped >> due to -ftoplevel-reorder possibly putting that asm block at the very >> end of the compilation unit. >> >> By using 'extern char []' instead this nastiness is avoided by referring >> to an unknown sized byte sequence starting at that symbol (with zero >> being a totally valid length). We don't need to know how many bytes >> follow the label. All we want to know is its address. And that's what an >> array type provides easily. > > I think my hangup is that arrays make me think of data, not code. I can say the same for 'extern char' -- that's clearly data, not even r/o data, less so .text ;) > I wouldn't > object if this were new code, but I dislike changing existing code to something > that's arguably just as flawed. Fair enough. > What if we declare the label as a function? That's not exactly correct either, > but IMO it's closer to the truth, and let's us document that the kernel trampoline > has a return value, i.e. preserves/outputs RAX. No, please don't. It's *not* a function, not at all. First thing that code block does is switching the stack, wich is already a bummer. But it also has no return instruction, making it fall-through to the middle of run_in_user() and execute code out of context. It's really just a label to mark the beginning of a code sequence we have to care about. Now, we do not want to ever call ret_to_kernel() from C, but declaring it as a function just feels way more worse than having a "wrong" extern char declaration. So, let's leave the code as-is. It's not broken, just imperfect. > > diff --git a/lib/x86/usermode.c b/lib/x86/usermode.c > index c3ec0ad7..a46a369a 100644 > --- a/lib/x86/usermode.c > +++ b/lib/x86/usermode.c > @@ -31,17 +31,18 @@ static void restore_exec_to_jmpbuf_exception_handler(struct ex_regs *regs) > #endif > } > > +uint64_t ret_to_kernel(void); > + > uint64_t run_in_user(usermode_func func, unsigned int fault_vector, > uint64_t arg1, uint64_t arg2, uint64_t arg3, > uint64_t arg4, bool *raised_vector) > { > - extern char ret_to_kernel; > volatile uint64_t rax = 0; > static unsigned char user_stack[USERMODE_STACK_SIZE]; > handler old_ex; > > *raised_vector = 0; > - set_idt_entry(RET_TO_KERNEL_IRQ, &ret_to_kernel, 3); > + set_idt_entry(RET_TO_KERNEL_IRQ, ret_to_kernel, 3); > old_ex = handle_exception(fault_vector, > restore_exec_to_jmpbuf_exception_handler); > >