On Fri, Aug 7, 2020 at 3:49 PM Alexander Graf <graf@xxxxxxxxxx> wrote: > > On 07.08.20 23:16, Jim Mattson wrote: > > This patch doesn't attempt to solve your problem, "tell user space > > about unhandled MSRs." It attempts to solve our problem, "tell user > > space about a specific set of MSRs, even if kvm learns to handle them > > in the future." This is, in fact, what we really wanted to do when > > Peter Hornyack implemented the "tell user space about unhandled MSRs" > > changes in 2015. We just didn't realize it at the time. > > Ok, let's take a step back then. What exactly are you trying to solve > and which MSRs do you care about? Our userspace handles a handful of MSRs, including some of the synthetic PV time-related MSRs (both kvm and Hyper-V), which we want to exit to userspace even though kvm thinks it knows what to do with them. We also have a range of Google-specific MSRs that kvm will probably never know about. And, finally, we find that as time goes on, there are situations where we want to add support for a new MSR in userspace because we can roll out a new userspace more quickly than a new kernel. An example of this last category is MSR_IA32_ARCH_CAPABILITIES. We don't necessarily want to cede control of such MSRs to kvm as soon as it knows about them. For instance, we may want to wait until the new kernel reaches its rollback horizon. There may be a few more MSRs we handle in userspace, but I think that covers the bulk of them. I don't believe we have yet seen a case where we wanted to take control of an MSR that kvm didn't intercept itself. That part of Aaron's patch may be overkill, but it gives the API clean semantics. It seems a bit awkward to have a userspace MSR list where accesses to some of the MSRs exit to userspace and accesses to others do not. (I believe that your implementation of the allow list suffers from this awkwardness, though I haven't yet had time to review it in depth.) > My main problem with a deny list approach is that it's (practically) > impossible to map the allow list use case onto it. It is however trivial > to only deny a few select MSRs explicitly with an allow list approach. I > don't want to introduce yet another ABI to KVM in 2 years to then have > both :). I agree in part. A deny list does not support your use case well. But does an allow list support it that much better? I suspect that the allow list that specifies "every MSR that kvm knows about today" is far from trivial to construct, especially if you consider different microarchitectures and different VM configurations (e.g. vPMU vs. no vPMU). Can we back up and ask what it is that you want to accomplish? If you simply want ignore_msrs to be per-VM, wouldn't it be easier to add an ioctl to simply set a boolean that potentially overrides the module parameter, and then modify the ignore_msrs code to consult the per-VM boolean? Getting back to one of the issues I raised earlier: your "exit instead of #GP" patch exits to userspace on both unknown MSRs and illegal WRMSR values. If userspace wants to implement "ignore MSRs" for a particular VM, it's absolutely trivial when the reason for the exit is "unknown MSR." However, it becomes more complicated when the reason for the exit is "#GP." Now, userspace has to know which MSRs are known to kvm before it can decide whether or not the access should #GP. I hope this doesn't come across sounding antagonistic. Like you, I want to avoid revisiting this design a few years down the road. We're already in that boat with the 2015 "exit on unknown MSR" change!