On Mon, Aug 24, 2020 at 11:09 AM Alexander Graf <graf@xxxxxxxxxx> wrote: > > > > On 24.08.20 19:23, Jim Mattson wrote: > > > > On Sun, Aug 23, 2020 at 6:35 PM Alexander Graf <graf@xxxxxxxxxx> wrote: > >> > >> > >> > >> On 21.08.20 19:58, Jim Mattson wrote: > >>> > >>> On Thu, Aug 20, 2020 at 3:55 PM Jim Mattson <jmattson@xxxxxxxxxx> wrote: > >>>> > >>>> On Thu, Aug 20, 2020 at 2:59 PM Alexander Graf <graf@xxxxxxxxxx> wrote: > >>>> > >>>>> Do we really need to do all of this dance of differentiating in kernel > >>>>> space between an exit that's there because user space asked for the exit > >>>>> and an MSR access that would just generate a #GP? > >>>>> > >>>>> At the end of the day, user space *knows* which MSRs it asked to > >>>>> receive. It can filter for them super easily. > >>>> > >>>> If no one else has an opinion, I can let this go. :-) > >>>> > >>>> However, to make the right decision in kvm_emulate_{rdmsr,wrmsr} > >>>> (without the unfortunate before and after checks that Aaron added), > >>>> kvm_{get,set}_msr should at least distinguish between "permission > >>>> denied" and "raise #GP," so I can provide a deny list without asking > >>>> for userspace exits on #GP. > >>> > >>> Actually, I think this whole discussion is moot. You no longer need > >>> the first ioctl (ask for a userspace exit on #GP). The allow/deny list > >>> is sufficient. Moreover, the allow/deny list checks can be in > >>> kvm_emulate_{rdmsr,wrmsr} before the call to kvm_{get,set}_msr, so we > >>> needn't be concerned with distinguishable error values either. > >>> > >> > >> I also care about cases where I allow in-kernel handling, but for > >> whatever reason there still would be a #GP injected into the guest. I > >> want to record those events and be able to later have data that tell me > >> why something went wrong. > >> > >> So yes, for your use case you do not care about the distinction between > >> "deny MSR access" and "report invalid MSR access". However, I do care :). > > > > In that case, I'm going to continue to hold a hard line on the > > distinction between a #GP for an invalid MSR access and the #GP for an > > unknown MSR. If, for instance, you wanted to implement ignore_msrs in > > userspace, as you've proposed in the past, this would be extremely > > helpful. Without it, userspace gets an exit because (1) the MSR access > > isn't in the allow list, (2) the MSR access is invalid, or (3) the MSR > > is unknown to kvm. As you've pointed out, it is easy for userspace to > > distinguish (1) from the others, since it provided the allow/deny list > > in the first place. But how do you distinguish (2) from (3) without > > replicating the logic in the kernel? > > > >> My stance on this is again that it's trivial to handle a few invalid MSR > >> #GPs from user space and just not report anything. It should come at > >> almost negligible performance cost, no? > > > > Yes, the performance cost should be negligible, but what is the point? > > We're trying to design a good API here, aren't we? > > > >> As for your argumentation above, we have a second call chain into > >> kvm_{get,set}_msr from the x86 emulator which you'd also need to cover. > >> > >> One thing we could do I guess is to add a parameter to ENABLE_CAP on > >> KVM_CAP_X86_USER_SPACE_MSR so that it only bounces on certain return > >> values, such as -ENOENT. I still fail to see cases where that's > >> genuinely beneficial though. > > > > I'd like to see two completely independent APIs, so that I can just > > request a bounce on -EPERM through a deny list. I think it's useful > > Where would that bounce to? Which user space event does that trigger? > Yet another one? Wouldn't 4 exit reasons just for MSR traps be a bit > much? :) All of the exits are either KVM_EXIT_X86_RDMSR or KVM_EXIT_X86_WRMSR. Or, we could put the direction in the msr struct and just have one exit reason. > > to distinguish between -ENOENT and -EINVAL, but I have no issues wih > > both causing an exit to userspace, if userspace has requested exits on > > MSR #GPs. > > So imagine we took the first argument to ENABLE_CAP as filter: > > (1<<0) REPORT_ENOENT > (1<<1) REPORT_EINVAL > (1<<2) REPORT_EPERM > (1<<31) REPORT_ANY > > Then we also add the reason to the kvm_run exit response and user space > can differentiate easily between the different events. I think this works well. I still have to call both APIs to satisfy my use case, but I'm willing to cave on that request. (I just realized that there is a very good use case for an allow/deny list *without* exits to userspace: prohibiting kvm from doing cross-vendor MSR emulation.)