Zachary Amsden wrote:
> Fabrice Bellard wrote:
>
>> Hi,
>>
>> As something like VMI is expected to be supported soon by QEMU, I have
>> a small question: does the virtualization API you are defining
>> support executing the guest kernel code in ring 3?
>>
>> In QEMU for example, the guest kernel code can be executed either by
>> the dynamic translator (in this case CS.rpl = 0 and SS.rpl = 0) or by
>> the kqemu kernel module (in this case CS.rpl = 3 and SS.rpl = 3). So a
>> good behaviour would be to ignore the rpl field of both CS and SS in
>> kernel mode.
>>
>
> This requires a state tracking variable visible to the kernel, either
> directly, or abstracted via a function call, to answer, "did the
> previous interrupt context come from user mode?". That requires either
> changing the stack exception frame, or maintaining a parallel context
> stack, as well as adding a new paravirt-op, was_user_mode(u32 stack).
> This seems rather heavyweight, and without careful design, could
> possibly change the machine architecture such that the kernel can no
> longer run properly on native hardware. It is required, however, for
> x86_64 kernels, which will execute kernel code in ring 3. Things are a
> bit easier on x86_64 because of interrupt stacks. Definitely something
> to think about now, but I don't imagine it will be very possible to do
> on 32-bit without negatively impacting the critical path during return
> from kernel code.

I was not clear enough: QEMU stores the correct RPL in the CS and SS
values that are pushed on the stack for a system call or an exception,
so the guest kernel can tell whether it was called from user space or
from kernel space.

The only problem is if the guest kernel does a "push %cs" or a
"mov %cs, %eax" (or the SS equivalent) to read the RPL of the live CS or
SS register, because under kqemu that RPL is 3 even in kernel mode.

As you said, on x86_64 the guest kernel code must be executed in ring 3,
so you will have this problem on x86_64 anyway.

Fabrice.
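
To make the distinction concrete, here is a minimal sketch in C of the two ways a guest kernel might look at the RPL. It is only an illustration under stated assumptions: struct trap_frame, the helper names and the selector values used in the note below are hypothetical, and none of this is the actual Linux, VMI or QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SEL_RPL_MASK 0x3u   /* the low two bits of a selector hold the RPL */
#define RPL_KERNEL   0u
#define RPL_USER     3u

/* Hypothetical, simplified 32-bit exception frame (not the real layout). */
struct trap_frame {
    uint32_t eip;
    uint32_t cs;      /* CS selector saved at system call / exception entry */
    uint32_t eflags;
    uint32_t esp;
    uint32_t ss;      /* SS selector saved at entry */
};

/* Extract the RPL field from a segment selector. */
static inline unsigned selector_rpl(uint32_t selector)
{
    return selector & SEL_RPL_MASK;
}

/* Return the same selector with its RPL replaced, to model a kernel
 * selector as seen while the kernel itself runs in ring 3. */
static inline uint32_t with_rpl(uint32_t selector, unsigned rpl)
{
    assert(rpl <= SEL_RPL_MASK);
    return (selector & ~SEL_RPL_MASK) | rpl;
}

/* Works in both execution modes: relies only on the CS value saved in
 * the frame, which carries the RPL of the interrupted context. */
static inline bool frame_from_user_mode(const struct trap_frame *frame)
{
    return selector_rpl(frame->cs) == RPL_USER;
}

/* Fragile: relies on the RPL of the live CS (as read by "push %cs").
 * It gives the wrong answer when kernel code runs with CS.rpl = 3. */
static inline bool live_cs_looks_like_kernel(uint32_t live_cs)
{
    return selector_rpl(live_cs) == RPL_KERNEL;
}
```

With an illustrative kernel selector of 0x60, frame_from_user_mode() on a frame whose saved cs is 0x60 reports kernel mode whichever way the guest kernel is being executed, while live_cs_looks_like_kernel(with_rpl(0x60, 3)) reports false, which is the case that would need handling.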