Re: [PATCH 1/2] KVM: x86: Allow userspace to opt out of hypercall patching

Maxim Levitsky <mlevitsk@xxxxxxxxxx> · Wed, 24 Aug 2022 18:06:05 +0300

On Wed, 2022-08-24 at 14:43 +0000, Sean Christopherson wrote:
> On Wed, Aug 24, 2022, Maxim Levitsky wrote:
> > On Mon, 2022-03-28 at 18:28 +0000, Sean Christopherson wrote:
> > > On Mon, Mar 28, 2022, Oliver Upton wrote:
> > > > While I was looking at #UD under nested for this issue, I noticed:
> > > > 
> > > > Isn't there a subtle inversion on #UD intercepts for nVMX? L1 gets first dibs
> > > > on #UD, even though it is possible that L0 was emulating an instruction not
> > > > present in hardware (like RDPID). If L1 passed through RDPID the #UD
> > > > should not be reflected to L1.
> > > 
> > > Yes, it's a known bug.
> > > 
> > > > I believe this would require that we make the emulator aware of nVMX which
> > > > sounds like a science project on its own.
> > > 
> > > I don't think it would require any new awareness in the emulator proper, KVM
> > > would "just" need to ensure it properly morphs the resulting reflected #UD to a
> > > nested VM-Exit if the emulator doesn't "handle" the #UD.  In theory, that should
> > > Just Work...
> > > 
> > > > Do we write this off as another erratum of KVM's (virtual) hardware on VMX? :)
> > > 
> > > I don't think we write it off entirely, but it's definitely on the backburner
> > > because there are so precious few cases where KVM emulates on #UD.  And for good
> > > reason, e.g. the RDPID case takes an instruction that exists purely to optimize
> > > certain flows and turns them into dreadfully sloooow paths.
> > > 
> > 
> > I noticed that 'fix_hypercall_test' selftest fails if run in a VM. The reason is
> > that L0 patches the hypercall before L1 sees it so it can't really do anything
> > about it.
> > 
> > Do you think we can always stop patching hypercalls for the nested guest regardless
> > of the quirk, or that too will be considered breaking backwards compatibility?
> 
> Heh, go run it on Intel, problem solved ;-)
> 
> As discussed last year[*], it's impossible to get this right in all cases, ignoring
> the fact that patching in the first place is arguably wrong.  E.g. if KVM is running
> on AMD hardware and L0 exposes an Intel vCPU to L1, then it sadly becomes KVM's
> responsibility to patch L2 because from L1's perspective, a #UD on Intel's VMCALL
> in L2 is spurious.
> 
> Regardless of what path we take, I do think we should align VMX and SVM on exception
> intercept behavior.

Maybe then we should at least skip the selftest when running nested (it should
be easy to check the hypervisor bit in CPUID)?
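
For reference, a minimal sketch of such a check, assuming that "running nested"
can be approximated by the hypervisor-present bit (CPUID.01H:ECX bit 31). The
helper names below are made up for illustration and are not existing selftest
helpers:

```c
#include <cpuid.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:ECX bit 31 is reserved for hypervisors to signal their presence. */
#define CPUID_1_ECX_HYPERVISOR (1u << 31)

/* Hypothetical helper: decode the hypervisor-present bit from an ECX value. */
static bool ecx_has_hypervisor_bit(uint32_t ecx)
{
	return (ecx & CPUID_1_ECX_HYPERVISOR) != 0;
}

/* Hypothetical helper: true if this code itself runs as a guest. */
static bool running_under_hypervisor(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid() returns 0 if the requested leaf is not supported. */
	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return false;

	return ecx_has_hypervisor_bit(ecx);
}
```

The test would then bail out early (report "skipped") when
running_under_hypervisor() returns true. Note that the bit only says that some
hypervisor is present, not which one, nor whether it patches hypercalls.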

Oh well, I do understand that the whole 'patching' thing is one big mess :(

I wonder how hard it would be to get QEMU to disable this quirk...
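
On the userspace side it would probably look roughly like the sketch below.
This assumes the opt-out ends up exposed as KVM_X86_QUIRK_FIX_HYPERCALL_INSN
through KVM_CAP_DISABLE_QUIRKS2; the fallback values are my assumption for
older headers that lack the definitions, and the function names are made up:

```c
#include <linux/kvm.h>
#include <stdbool.h>
#include <sys/ioctl.h>

/* Fallbacks for older uapi headers; the values are assumptions. */
#ifndef KVM_CAP_DISABLE_QUIRKS2
#define KVM_CAP_DISABLE_QUIRKS2 213
#endif
#ifndef KVM_X86_QUIRK_FIX_HYPERCALL_INSN
#define KVM_X86_QUIRK_FIX_HYPERCALL_INSN (1 << 5)
#endif

/*
 * KVM_CHECK_EXTENSION on KVM_CAP_DISABLE_QUIRKS2 returns the mask of quirks
 * that can be disabled (0 or negative if the capability is unavailable).
 */
static bool quirk_is_disableable(int supported_quirks)
{
	return supported_quirks > 0 &&
	       (supported_quirks & KVM_X86_QUIRK_FIX_HYPERCALL_INSN);
}

/* Hypothetical helper: returns 0 on success, -1 on failure. */
static int disable_hypercall_patching(int vm_fd)
{
	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_DISABLE_QUIRKS2,
		/* args[0] is the set of quirks to disable. */
		.args = { KVM_X86_QUIRK_FIX_HYPERCALL_INSN },
	};
	int supported;

	supported = ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_DISABLE_QUIRKS2);
	if (!quirk_is_disableable(supported))
		return -1;

	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}
```

The call has to be made on the VM fd, and a failure has to be tolerated so
that things keep working on kernels without the capability.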

Best regards,
	Maxim Levitsky

> 
> [*] https://lore.kernel.org/all/YEZUhbBtNjWh0Zka@xxxxxxxxxx
>