On Tue, Jul 23, 2024 at 10:23:37PM +0800, Sean Christopherson wrote:
> On Tue, Jul 23, 2024, Hou Wenlong wrote:
> > On Tue, Jul 23, 2024 at 07:59:22AM +0800, Sean Christopherson wrote:
> > > ---
> > >
> > > I found this by inspection when backporting Hou's change to an internal kernel.
> > > I don't love piggybacking Intel's "you can't touch these special MSRs" behavior,
> > > but ignoring the userspace MSR filters is far worse IMO. E.g. if userspace is
> > > denying access to an MSR in order to reduce KVM's attack surface, letting L1
> > > sneak in reads/writes through VM-Enter/VM-Exit completely circumvents the
> > > filters.
> > >
> > >  Documentation/virt/kvm/api.rst  | 19 ++++++++++++++++---
> > >  arch/x86/include/asm/kvm_host.h |  2 ++
> > >  arch/x86/kvm/vmx/nested.c       | 12 ++++++------
> > >  arch/x86/kvm/x86.c              |  6 ++++--
> > >  4 files changed, 28 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > > index 8e5dad80b337..e6b1e42186f3 100644
> > > --- a/Documentation/virt/kvm/api.rst
> > > +++ b/Documentation/virt/kvm/api.rst
> > > @@ -4226,9 +4226,22 @@ filtering. In that mode, ``KVM_MSR_FILTER_DEFAULT_DENY`` is invalid and causes
> > >  an error.
> > >
> > >  .. warning::
> > > -   MSR accesses as part of nested VM-Enter/VM-Exit are not filtered.
> > > -   This includes both writes to individual VMCS fields and reads/writes
> > > -   through the MSR lists pointed to by the VMCS.
> > > +   MSR accesses that are side effects of instruction execution (emulated or
> > > +   native) are not filtered as hardware does not honor MSR bitmaps outside of
> > > +   RDMSR and WRMSR, and KVM mimics that behavior when emulating instructions
> > > +   to avoid pointless divergence from hardware. E.g. RDPID reads MSR_TSC_AUX,
> > > +   SYSENTER reads the SYSENTER MSRs, etc.
> > > +
> > > +   MSRs that are loaded/stored via dedicated VMCS fields are not filtered as
> > > +   part of VM-Enter/VM-Exit emulation.
> > > +
> > > +   MSRs that are loaded/store via VMX's load/store lists _are_ filtered as part
> > > +   of VM-Enter/VM-Exit emulation. If an MSR access is denied on VM-Enter, KVM
> > > +   synthesizes a consistency check VM-Exit(EXIT_REASON_MSR_LOAD_FAIL). If an
> > > +   MSR access is denied on VM-Exit, KVM synthesizes a VM-Abort. In short, KVM
> > > +   extends Intel's architectural list of MSRs that cannot be loaded/saved via
> > > +   the VM-Enter/VM-Exit MSR list. It is platform owner's responsibility to
> > > +   to communicate any such restrictions to their end users.
> > >
> > Do we also need to modify the statement before this warning?
>
> Yeah, that's a good idea.
>
> While you're here, did you have a use case that is/was affected by the current
> VM-Enter/VM-Exit vs. MSR filtering behavior?
>
Uh, nested virtualization is not usually used in our environment, and I
hadn't tested it with MSR filtering before. I found a conflict between MSR
filtering and RDPID instruction emulation when testing the x86 emulator
for PVM, so I sent this patch. At that time, I thought that the state
transitions (including VM-Enter/VM-Exit) would also be affected by MSR
filtering, so I mentioned it in the commit message.

> > Since the behaviour is different from RDMSR/WRMSR emulation case.
> >
> > ```
> > if an MSR access is denied by userspace the resulting KVM behavior depends on
> > whether or not KVM_CAP_X86_USER_SPACE_MSR's KVM_MSR_EXIT_REASON_FILTER is
> > enabled. If KVM_MSR_EXIT_REASON_FILTER is enabled, KVM will exit to userspace
> > on denied accesses, i.e. userspace effectively intercepts the MSR access.
> > ```