On 08.01.2018 21:59, Jim Mattson wrote:
> On Mon, Jan 8, 2018 at 12:27 PM, David Hildenbrand <david@xxxxxxxxxx> wrote:
>> On 08.01.2018 21:19, Jim Mattson wrote:
>>> Even more trivially, what if the L2 VM is configured never to leave
>>> VMX non-root operation? Then we never exit to userspace?
>>
>> Well we would make the PC point at the VMLAUNCH. Then exit to userspace.
>
> That doesn't work, for so many reasons.
>
> 1. It's not sufficient to just rollback the instruction pointer. You
> also need to rollback CS, CR0, CR3 (and possibly the PDPTEs), and CR4,
> so that the virtual address of the instruction pointer will actually
> map to the same physical address as it did the first time.

I expect these values to be the same once leaving non-root mode (as the
CPU itself hasn't executed anything except the nested guest). But yes, it
could be tricky.

> If the page
> tables have changed, or the L1 guest has overwritten the
> VMLAUNCH/VMRESUME instruction, then you're out of luck.

Page tables getting changed by other CPUs is actually a good point. But
I would consider both as "theoretical" problems. At least compared to
the interrupt stuff, which could also happen on guests behaving in a
more sane way.

> 2. As you point out, interrupts are a problem. Interrupts can't be
> delivered in this context, because the vCPU shouldn't be in this
> context (and the guest may have already observed the transition to
> L2).

Yes, I also see this as the major problem.

> 3. I'm assuming that you're planning to store most of the current L2
> state in the cached VMCS12, at least where you can. Even so, the next
> "VM-entry" can't perform any of the normal VM-entry actions that would
> clobber the current L2 state that isn't in the cached VMCS12 (e.g.
> processing the VM-entry MSR load list). So, you need to have a flag
> indicating that this isn't a real VM-entry. That's no better than
> carrying the nested_run_pending flag.

Not sure if that would really be necessary (I would have to look into the
details first). But it sounds like nested_run_pending is unavoidable on
x86. So I'd better get used to QEMU dealing with nested CPU state (which
is somehow scary to me: an emulator getting involved in nested
execution, what could go wrong :) )

Good we talked about it (and thanks for your time). I learned a lot today!

>>
>> --
>>
>> Thanks,
>>
>> David / dhildenb

--

Thanks,

David / dhildenb