Re: RFC: few questions about hypercall patching in KVM

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2022-12-15 at 00:53 +0000, Sean Christopherson wrote:
> On Wed, Dec 14, 2022, Maxim Levitsky wrote:
> > Hi!
> > 
> > 
> > Recently I had to debug a case of KVM's hypercall patching failing in a
> > special case of running qemu under valgrind.
> >  
> > In nutshell what is happening is that qemu uses 'cpuid' instruction to gather
> > some info about the host and some of it is passed to the guest cpuid, and
> > that includes the vendor string.
> >  
> > Under valgrind it emulates the CPU (aka TCG), so qemu sees virtual cpu, with
> > virtual cpuid which has hardcoded vendor string the 'GenuineIntel', so when
> > your run qemu with KVM on AMD host, the guest will see Intel's vendor string
> > regardless of other '-cpu' settings (even -cpu host)
> >  
> > This ensures that the guest uses the wrong hypercall instruction (vmcall
> > instead of vmmcall), and sometimes it will use it after the guest kernel
> > write protects its memory.  This will lead to a failure of the hypercall
> > patching as the kvm writes to the guest memory as if the instruction wrote to
> > it, and this checks the permissions in the guest paging.
> > 
> > So the VMCALL instruction gets totally unexpected #PF.
> 
> Yep, been there, done that :-)
> 
> > 1. Now I suggest that when hypercall patching fails, can we do
> > kvm_vm_bugged() instead of forwarding the hypercall?  I know that vmmcall can
> > be executed from ring 3 as well, so I can limit this to hypercall patching
> > that happens when guest ring is 0.
> 
> And L1.  But why?  It's not a KVM bug per se, it's a known deficiency in KVM's
> emulator.  What to do in response to the failure should be up to userspace.  The
> real "fix" is to disable the quirk in QEMU.

Yes, and L1, you are right - I thought about nested case, that maybe it is possible
to eliminate it, but you are right, it can't be eliminated.

My reasoning for doing kvm_vm_bugged() (or returning X86EMUL_UNHANDLEABLE even better maybe,
to give userspace a theoretical chance of dealing with it) 

is to make the error at least a bit more visible. 
(I for example thought for a while that there is some memory corrupion in the guest caused by valgrind,
which cause that #PF)


> 
> > 2. Why can't we just emulate the VMCALL/VMMCALL instruction in this case
> > instead of patching? Any technical reasons for not doing this?  Few guests
> > use it so the perf impact should be very small.
> 
> Nested is basically impossible to get right[1][2].  IIRC, calling into
> kvm_emulate_hypercall() from the emulator also gets messy (I think I tried doing
> exactly this at some point).

It could very well be, however if L0's KVM starts to emulate both VMMCALL and VMCALL
instructions (when the quirk is enabled) then it will be the closest to what KVM always did,
and it will not overwrite the guest memory.

About calling into kvm_emulate_hypercall I can expect trouble, but I would be very happy
if you recall which problems did you face.


Note that at least for a nested guest, we can avoid patching right away because both VMMCALL and VMCALL
that are done in nested guest will never need to call kvm_emulate_hypercall().

VMCALL is always intercepted by L1 as defined by VMX spec, while VMMCALL if not intercepted causes #UD
in the guest.

In those cases emulation is very simple.

As for L1, we already have a precedent: #GP is sometimes
emulated as SVM instruction due to the AMD's errata.


Look at gp_interception:

You first decode the instruciton, and if it is VMCALL, then call the kvm_emulate_hypercall()
This way there is no recursive emulator call.

What do you think?



Best regards,
	Maxim Levitsky


> 
> [1] https://lore.kernel.org/all/Yjyt7tKSDhW66fnR@xxxxxxxxxx 
> [2] https://lore.kernel.org/all/YEZUhbBtNjWh0Zka@xxxxxxxxxx
> 





[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux