On Sat, Mar 06, 2021, Paolo Bonzini wrote:
> On 06/03/21 02:39, Sean Christopherson wrote:
> > Unless KVM (L0) knowingly wants to override L1, e.g. KVM_GUESTDBG_* cases, KVM
> > shouldn't do a damn thing except forward the exception to L1 if L1 wants the
> > exception.
> >
> > ud_interception() and gp_interception() do quite a bit before forwarding the
> > exception, and in the case of #UD, it's entirely possible the #UD will never get
> > forwarded to L1. #GP is even more problematic because it's a contributory
> > exception, and kvm_multiple_exception() is not equipped to check and handle
> > nested intercepts before vectoring the exception, which means KVM will
> > incorrectly escalate a #GP->#DF and #GP->#DF->Triple Fault instead of exiting
> > to L1. That's a wee bit problematic since KVM also has a soon-to-be-fixed bug
> > where it kills L1 on a Triple Fault in L2...
>
> I agree with the #GP problem, but this is on purpose. For example, if L1
> CPUID has MOVBE and it is being emulated via #UD, L1 would be right to set
> MOVBE in L2's CPUID and expect it not to cause a #UD.

The opposite is also true, since KVM has no way of knowing what CPU model L1 has
exposed to L2. Though admittedly hiding MOVBE is a rather contrived case. But,
the other EmulateOnUD instructions that don't have an intercept condition,
SYSENTER, SYSEXIT, SYSCALL, and VMCALL, are also suspect. SYS* will mostly do
the right thing, though it's again technically possible that KVM will do the
wrong thing since KVM doesn't know L2's CPU model. VMCALL is also probably ok
in most scenarios, but patching L2's code from L0 KVM is sketchy.

> The same is true for the VMware #GP interception case.

I highly doubt that will ever work out as intended for the modified IO #GP
behavior. The only way emulating #GP in L2 is correct is if L1 wants to pass
through the capabilities to L2, i.e. the I/O access isn't intercepted by L1.
That seems unlikely.
If the I/O is intercepted by L1, bypassing the IOPL and TSS-bitmap checks is
wrong and will cause L1 to emulate I/O for L2 userspace that should never be
allowed. Odds are there isn't a corresponding emulated port in L1, i.e. there's
no major security flaw, but it's far from good behavior.

I can see how some of the instructions will kinda sorta work, but IMO forwarding
them to L1 is much safer, even if it means that L1 will see faults that should
be impossible. At least for KVM-on-KVM, those spurious faults are benign since
L1 likely also knows how to emulate the #UD and #GP instructions.