On 08.01.2018 19:11, Paolo Bonzini wrote:
> On 08/01/2018 18:59, David Hildenbrand wrote:
>> On 08.01.2018 18:36, Paolo Bonzini wrote:
>>> On 08/01/2018 11:35, David Hildenbrand wrote:
>>>> Thinking about it, I agree. It might be simpler/cleaner to transfer the
>>>> "loaded" VMCS. But I think we should take care of only transferring data
>>>> that actually is CPU state and not special to our current
>>>> implementation. (e.g. nested_run_pending I would say is special to our
>>>> current implementation, but we can discuss)
>>>>
>>>> So what I would consider VMX state:
>>>> - vmxon
>>>> - vmxon_ptr
>>>> - vmptr
>>>> - cached_vmcs12
>>>> - ... ?
>>>
>>> nested_run_pending is in the same boat as the various
>>> KVM_GET_VCPU_EVENTS flags (e.g. nmi.injected vs. nmi.pending). It's not
>>> "architectural" state, but it's part of the state machine so it has to
>>> be serialized.
>>
>> I am wondering if we can get rid of it. In fact if we can go out of VMX
>> mode every time we go to user space.
>
> There are cases where this is not possible, for example if you have a
> nested "assigned device" that is emulated by userspace.

You mean L0 emulates a device for L1 (in userspace), and L1 passes this
device on to L2 as an "assigned device". Now if L2 tries to access the
device, L0 has to exit to userspace to handle the access, which requires
us to stay in nested mode so we can properly process the request when
re-entering KVM from userspace?

Didn't know that was even possible.

>
> Paolo

--
Thanks,

David / dhildenb