Re: [RFC 1/2] KVM/nVMX: Cleanly exit from L2 to L1 on user-space exit

Now that I have looked at Jim's patch and gone through *most* of
the comments, I think my approach is reasonable, and in fact I no
longer see any of the downsides mentioned in the other thread. In
particular, the VMLAUNCH case is now handled cleanly, using an
approach that already has precedent in the code base.
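
To make the VMLAUNCH handling concrete, here is a minimal toy model of
the rule described in the commit message quoted further down: if user
space wants KVM_RUN to return while an L1 VMLAUNCH/VMRESUME is still
pending, do not exit yet; let L2 enter once and request an immediate
exit instead. This is only a sketch, not the code from the patch. Every
name in it (toy_vcpu, toy_handle_user_exit, toy_enter_guest and the
struct fields) is hypothetical and does not correspond to a KVM symbol.

```c
#include <stdbool.h>

/* Toy stand-in for per-vCPU state; all field names are hypothetical. */
struct toy_vcpu {
    bool in_l2;               /* the nested (L2) guest is the active context */
    bool nested_run_pending;  /* L1 issued VMLAUNCH/VMRESUME, L2 has not run yet */
    bool signal_pending;      /* user space wants KVM_RUN to return */
    bool immediate_exit_req;  /* leave the guest right after the next entry */
};

enum toy_action { TOY_ENTER_GUEST, TOY_EXIT_TO_USER };

/* Decide what to do when the run loop considers exiting to user space. */
static enum toy_action toy_handle_user_exit(struct toy_vcpu *v)
{
    if (!v->signal_pending)
        return TOY_ENTER_GUEST;

    if (v->in_l2 && v->nested_run_pending) {
        /* Give L2 one chance to run, but come straight back afterwards. */
        v->immediate_exit_req = true;
        return TOY_ENTER_GUEST;
    }

    if (v->in_l2) {
        /* Model the forced L2->L1 exit so user space only sees L1 state. */
        v->in_l2 = false;
    }
    return TOY_EXIT_TO_USER;
}

/* Model one guest entry: the pending nested run is consumed by it. */
static void toy_enter_guest(struct toy_vcpu *v)
{
    v->nested_run_pending = false;
    v->immediate_exit_req = false;
}
```

In this model, a vCPU with both nested_run_pending and signal_pending
set goes around the loop once more: the first call returns
TOY_ENTER_GUEST, toy_enter_guest() clears the pending run, and the
second call returns TOY_EXIT_TO_USER with in_l2 cleared.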

The only thing that is not handled, AFAICT, is:

> The other unfortunate thing about flushing the "current" VMCS12 state
> to guest memory on each L2->userspace transition is that much of this
> state is in the VMCS02. So, it's not just a matter of writing a
> VMCS12_SIZE blob to guest memory; first, the cached VMCS12 has to be
> updated from the VMCS02 by calling sync_vmcs12(). This will be
> particularly bad for double-nesting, where L1 kvm has to take all of
> those VMREAD VM-exits.

... which is something I can fix if needed, but is anyone really
running double-nested setups today? Do we actually need to optimize
for that case at all?
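
For reference, the cost in the quoted concern comes from two steps:
first refresh the cached VMCS12 from the VMCS02, one field read at a
time, and only then copy the blob to guest memory. The toy model below
counts those reads. It is a sketch under stated assumptions: only
sync_vmcs12() and VMCS12_SIZE appear in the quote, while the toy_*
names, the four-field layout and the read counter are hypothetical. In
a double-nested setup, each toy_vmread() would stand for a VMREAD that
exits to the outer hypervisor.

```c
#include <stddef.h>
#include <string.h>

#define TOY_NFIELDS 4  /* hypothetical; a real VMCS12 has far more fields */

/* Toy stand-in for the cached VMCS12 that L0 keeps for L1. */
struct toy_vmcs12 {
    unsigned long field[TOY_NFIELDS];
};

static unsigned long toy_vmcs02[TOY_NFIELDS]; /* stand-in for the hardware VMCS02 */
static unsigned int toy_vmread_count;         /* number of field reads performed */

/* Model one VMREAD of the VMCS02; this is the expensive part when nested. */
static unsigned long toy_vmread(size_t i)
{
    toy_vmread_count++;
    return toy_vmcs02[i];
}

/* Step 1: refresh the cached VMCS12 from the VMCS02, field by field. */
static void toy_sync_vmcs12(struct toy_vmcs12 *cached)
{
    for (size_t i = 0; i < TOY_NFIELDS; i++)
        cached->field[i] = toy_vmread(i);
}

/* Step 2: write the refreshed blob to (a model of) guest memory. */
static void toy_flush_to_guest(struct toy_vmcs12 *cached, void *guest_mem)
{
    toy_sync_vmcs12(cached);
    memcpy(guest_mem, cached, sizeof(*cached));
}
```

Each call to toy_flush_to_guest() adds TOY_NFIELDS to toy_vmread_count,
so the per-exit cost in this model grows with the number of fields that
live in the VMCS02 rather than with the size of the copy.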

Is there anything else that I am missing?

So what are the upsides of my approach?

1- It ensures that user-space tools that do not understand nesting
   can still see the expected guest state when querying guest state or
   even when trying to read memory, translate an address, etc.
2- It is very simple and does not require a whole lot of state in
   user-space.
3- It's even rebased on master :) (Ok, maybe this is not a technical
   reason :D)

Thoughts?

On a side note: I have also fixed the VMEntry issue that I mentioned
in the commit message, and I have already completed hundreds of
save/resume cycles successfully.

On Fri, 2018-02-16 at 16:23 +0100, KarimAllah Ahmed wrote:
> On 02/16/2018 03:52 PM, Paolo Bonzini wrote:
> > 
> > On 16/02/2018 15:23, KarimAllah Ahmed wrote:
> > > 
> > > On exit to L0 user-space, always exit from L2 to L1 and
> > > synchronize the state properly for L1. This ensures that
> > > user-space only ever sees L1 state. It also allows L1 to be
> > > saved and resumed properly. Obviously horrible things will still
> > > happen to the L2 guest. This will be handled in a separate patch.
> > > 
> > > There is only a single case which requires a bit of extra care:
> > > when the decision to switch to user space happens while handling
> > > an L1 VMRESUME/VMLAUNCH (i.e. pending_nested_run). In order to
> > > handle this as cleanly as possible without major restructuring,
> > > we simply do not exit to user-space in this case and give L2
> > > another chance to actually run. We also request an immediate exit
> > > to ensure that an exit to user space will still happen for the
> > > L2.
> > > 
> > > The only reason I can see where an exit to user space will occur
> > > while L2 is running is because of a pending signal. This is how
> > > user space preempts the KVM_RUN in order to save the state. L2
> > > exits are either handled in L0 kernel or reflected to L1 and not
> > > handled in L0 user-space.
> > > 
> > > Signed-off-by: KarimAllah Ahmed <karahmed@xxxxxxxxx>
> > 
> > We discussed this with Jim about one year ago and then again last
> > January.  While I (in 2017) and David H. (in 2018) also thought
> > about doing an L2->L1 exit like this, Jim quickly got me to change
> > my mind---it doesn't really seem like a good idea compared to doing
> > full checkpointing of VMX state.  You can find the discussion at
> > https://patchwork.kernel.org/patch/9454799/.
> > 
> > Of course, Jim's series (first posted Nov 2016) is way more complex
> > than yours, but the good news is that most of his changes have
> > already been merged; the only ones missing are:
> > 
> > https://patchwork.kernel.org/patch/9454799/
> >   [7/8] kvm: nVMX: Introduce KVM_CAP_VMX_STATE
> > 
> > https://patchwork.kernel.org/patch/9454797/
> >   [8/8] kvm: nVMX: Defer gpa->hpa lookups for set_vmx_state
> 
> Oh! Thank you for pointing this out. Somehow I did not notice any of
> this :)
> 
> I was also thinking about doing a full save of VMX state, then I
> decided to do the switch instead.
> 
> In any case, looking forward to seeing those bits in master.
> 
> > 
> > 
> > The main request was to make [7/8] a bit more generic so that it
> > can be applied to SVM as well.  That's pretty simple though.
> > 
> > Thanks,
> > 
> > Paolo
> > 



