On Thu, May 2, 2019 at 11:03 AM Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote: > > +Cc Jim > > On Wed, May 01, 2019 at 10:09:19PM -0400, Jintack Lim wrote: > > Even when neither L0 nor L1 configured to trap MSR_FS_BASE writes from > > its own VMs, the current KVM L0 always traps MSR_FS_BASE writes from L2. > > Let's check if both L0 and L1 disabled trap for MSR_FS_BASE for its VMs > > respectively, and let L2 write to MSR_FS_BASE without trap if that's the > > case. > > > > Signed-off-by: Jintack Lim <jintack@xxxxxxxxxxxxxxx> > > --- > > arch/x86/kvm/vmx/nested.c | 7 +++++++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c > > index 0c601d0..ab85aea 100644 > > --- a/arch/x86/kvm/vmx/nested.c > > +++ b/arch/x86/kvm/vmx/nested.c > > @@ -537,6 +537,7 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu, > > */ > > bool pred_cmd = !msr_write_intercepted_l01(vcpu, MSR_IA32_PRED_CMD); > > bool spec_ctrl = !msr_write_intercepted_l01(vcpu, MSR_IA32_SPEC_CTRL); > > + bool fs_base = !msr_write_intercepted_l01(vcpu, MSR_FS_BASE); > > This isn't sufficient as we only fall into this code if L2 is in x2APIC > mode or has accessed the speculation MSRs. The quick fix is to check if > we want to pass through MSR_FS_BASE, but if we're going to open up the > floodgates then we should pass through as many MSRs as possible, e.g. > GS_BASE, KERNEL_GS_BASE, TSC, SYSENTER_*, etc..., and do so using a > generic mechanism. > > That being said, I think there are other reasons why KVM doesn't pass > through MSRs to L2. Unfortunately, I'm struggling to recall what those > reasons are. > > Jim, I'm pretty sure you've looked at this code a lot, do you happen to > know off hand? Is it purely a performance thing to avoid merging bitmaps > on every nested entry, is there a subtle bug/security hole, or is it > simply that no one has ever gotten around to writing the code? > > > > > /* Nothing to do if the MSR bitmap is not in use. */ > > if (!cpu_has_vmx_msr_bitmap() || > > @@ -592,6 +593,12 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu, > > } > > } > > > > + if (fs_base) > > + nested_vmx_disable_intercept_for_msr( > > + msr_bitmap_l1, msr_bitmap_l0, > > + MSR_FS_BASE, > > + MSR_TYPE_W); > > This should be MSR_TYPE_RW. Should I explicitly check if L1 disabled intercept for read operations along with write operations? It seems like the current code only checks write operations for spec_ctrl while it disables intercept for RW. This is not the case for pred_cmd, though. I might be missing something here. Could you explain it? > > > + > > if (spec_ctrl) > > nested_vmx_disable_intercept_for_msr( > > msr_bitmap_l1, msr_bitmap_l0, > > -- > > 1.9.1 > > > > >