On Wed, Aug 18, 2021, Robert Hoo wrote:
> > Limiting this to VMREAD/VMWRITE means we shouldn't need a bitmap and
> > can use a more static lookup, e.g. a switch statement.
>
> Emm, hard for me to choose:
>
> Your approach sounds more efficient for the CPU: once the VMX MSRs are
> updated, there's no need to update a bitmap; each field's existence
> check directly consults the related VMX MSR. But the switch statement
> will be long...

How long? Honest question, off the top of my head I don't have a feel
for how many fields conditionally exist.

> My implementation: once a VMX MSR is updated, the update needs to be
> propagated to the bitmap, which is one extra step compared to the
> above. But later, when querying field existence, especially for fields
> that depend on vm{entry,exit}_ctrl, we would usually have to consult
> both MSRs if there is no bitmap, and we cannot guarantee there won't be
> more complicated dependencies in the future. With the bitmap, the query
> is just a 1-bit read; without it, several MSR reads and compares are
> needed.

Yes, but the bitmap is per-VM and may or may not be cache-hot for
back-to-back VMREAD/VMWRITE to different fields, whereas the shadow
controls are much more likely to reside somewhere in the caches.

> And the VMX MSR --> bitmap update usually happens only once, when the
> vCPU model is settled, whereas VMREAD/VMWRITE might happen frequently,
> depending on the guest. I'd rather leave the complicated comparisons in
> the former than in the latter.

I'm not terribly concerned about the runtime performance, it's the
extra per-VM allocation for something that's not thaaat interesting
that I don't like. And for performance, most of the frequently accessed
VMCS fields will be shadowed anyways, i.e. won't VM-Exit in the first
place.

And that brings up another wrinkle. The shadow VMCS bitmaps are global
across all VMs, e.g. if the preemption timer is supported in hardware
but hidden from L1, then a misbehaving L1 can VMREAD/VMWRITE the field
even with this patch.
If it was just the preemption timer we could consider disabling shadow
VMCS for the VM if the timer exists but is hidden from L1, but
GUEST_PML_INDEX and GUEST_INTR_STATUS are also conditional :-(

Maybe there's a middle ground, e.g. let userspace tell KVM which fields
it plans on exposing to L1, use that to build the bitmaps, and disable
shadow VMCS if userspace creates VMs that don't match the specified
configuration. Burning three more pages per VM isn't very enticing...

This is quite the complicated mess for something I'm guessing no one
actually cares about. At what point do we chalk this up as a
virtualization hole and sweep it under the rug?