On Thu, May 02, 2019 at 10:59:16AM -0700, Jim Mattson wrote:
> On Thu, May 2, 2019 at 8:03 AM Sean Christopherson
> <sean.j.christopherson@xxxxxxxxx> wrote:
>
> > That being said, I think there are other reasons why KVM doesn't pass
> > through MSRs to L2. Unfortunately, I'm struggling to recall what those
> > reasons are.
> >
> > Jim, I'm pretty sure you've looked at this code a lot, do you happen to
> > know off hand? Is it purely a performance thing to avoid merging bitmaps
> > on every nested entry, is there a subtle bug/security hole, or is it
> > simply that no one has ever gotten around to writing the code?
>
> I'm not aware of any subtle bugs or security holes. If L1 changes the
> VMCS12 MSR permission bitmaps while L2 is running, behavior is
> unlikely to match hardware, but this is clearly in "undefined
> behavior" territory anyway. IIRC, the posted interrupt structures are
> the only thing hanging off of the VMCS that can legally be modified
> while a logical processor with that VMCS active is in VMX non-root
> operation.

Cool, thanks!

> I agree that FS_BASE, GS_BASE, and KERNEL_GS_BASE, at the very least,
> are worthy of special treatment. Fortunately, their permission bits
> are all in the same quadword. Some of the others, like the SYSENTER
> and SYSCALL MSRs are rarely modified by a typical (non-hypervisor) OS.
> For nested performance at levels deeper than L2, they might still
> prove interesting.

Agreed on the *_BASE MSRs. Rarely written MSRs should be intercepted so
that the nested VM-Exit path doesn't need to read them from vmcs02 on every
exit (WIP). I'm playing with the nested code right now and one of the
things I'm realizing is that KVM spends an absurd amount of time copying
data to/from VMCSes for fields that are almost never accessed by L0 or L1.

> Basically, I think no one has gotten around to writing the code.
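
To make the "same quadword" point concrete, here is a rough user-space sketch of merging only the *_BASE permission bits when building the bitmap used to run L2. It is not KVM code: the function names, the bitmap01/bitmap12/bitmap02 parameter names, and the "intercept everything else" policy are assumptions for illustration. The MSR numbers and the 4 KiB bitmap layout (read-low, read-high, write-low, write-high regions of 1 KiB each, bit set = intercept) follow the architectural definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MSR_FS_BASE        0xc0000100u
#define MSR_GS_BASE        0xc0000101u
#define MSR_KERNEL_GS_BASE 0xc0000102u

/* Byte offsets of the high-range (0xc0000000..0xc0001fff) regions. */
#define READ_HIGH_OFFSET   1024u
#define WRITE_HIGH_OFFSET  3072u
#define BITMAP_BYTES       4096u

/* Hypothetical helper: quadword index (within the page) for a high MSR. */
static size_t high_msr_qword(uint32_t msr, unsigned int region_offset)
{
    uint32_t bit = msr - 0xc0000000u;

    assert(bit < 0x2000u);
    return region_offset / sizeof(uint64_t) + bit / 64;
}

/* Hypothetical helper: mask for that MSR's bit inside its quadword. */
static uint64_t high_msr_mask(uint32_t msr)
{
    return 1ull << ((msr - 0xc0000000u) % 64);
}

/*
 * Assumed policy: intercept every MSR for L2, except that FS_BASE,
 * GS_BASE and KERNEL_GS_BASE are passed through when neither L0
 * (bitmap01) nor L1 (bitmap12) wants to intercept them.
 */
static void merge_base_msrs(uint64_t *bitmap02, const uint64_t *bitmap01,
                            const uint64_t *bitmap12)
{
    const unsigned int regions[] = { READ_HIGH_OFFSET, WRITE_HIGH_OFFSET };
    const uint64_t mask = high_msr_mask(MSR_FS_BASE) |
                          high_msr_mask(MSR_GS_BASE) |
                          high_msr_mask(MSR_KERNEL_GS_BASE);
    size_t i;

    memset(bitmap02, 0xff, BITMAP_BYTES);

    for (i = 0; i < sizeof(regions) / sizeof(regions[0]); i++) {
        /* All three MSRs share this quadword, so one OR covers them. */
        size_t q = high_msr_qword(MSR_FS_BASE, regions[i]);
        uint64_t intercept = (bitmap01[q] | bitmap12[q]) & mask;

        bitmap02[q] = (bitmap02[q] & ~mask) | intercept;
    }
}
```

With this layout, a merge for the three *_BASE MSRs touches only two quadwords per nested entry (one in the read region, one in the write region), which is why the cost of special-casing them is small.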