On 08.01.2018 22:55, Jim Mattson wrote:
> On Mon, Jan 8, 2018 at 1:46 PM, David Hildenbrand <david@xxxxxxxxxx> wrote:
>>> The vmcs01 does serve as a cache of L1 state at the time of VM-entry,
>>> so if we simply restored the vmcs01 state, that would take care of
>>> most of the rollback issues, as long as we don't deliver any
>>> interrupts in this context. However, I would like to see the
>>> vmcs01/vmcs02 separation go away at some point. svm.c seems to do fine
>>> with just one VMCB.
>>
>> Interesting point, might make things easier for VMX.
>>
>>>
>>>>> If the page
>>>>> tables have changed, or the L1 guest has overwritten the
>>>>> VMLAUNCH/VMRESUME instruction, then you're out of luck.
>>>>
>>>> Page tables getting changed by other CPUs is actually a good point. But
>>>> I would consider both as "theoretical" problems. At least compared to
>>>> the interrupt stuff, which could also happen on guests behaving in a
>>>> more sane way.
>>>
>>> My preference is for solutions that are architecturally correct,
>>> thereby solving the theoretical problems as well as the empirical
>>> ones. However, I grant that the Linux community leans the other way in
>>> general.
>>
>> I usually agree, unless it makes the code horribly complicated without
>> any real benefit. (e.g. for corner cases like this one: a CPU modifying
>> instruction text of another CPU which is currently executing them)
>
> I don't feel that one additional bit of serialized state is horribly
> complicated. It's aesthetically unpleasant, to be sure, but not
> horribly complicated. And it's considerably less complicated than your
> proposal. :-)

Well, nested_run_pending is just the tip of the iceberg of the (in my
opinion complicated) part of exiting to user space with nested state,
exposing all* VCPU ioctls to it.

But as we learned, it's the little things that cause big problems :)

-- 
Thanks,

David / dhildenb