On Mon, Jan 10, 2022 at 6:11 PM Like Xu <like.xu.linux@xxxxxxxxx> wrote:
>
> On 11/1/2022 2:13 am, Jim Mattson wrote:
> > On Sun, Jan 9, 2022 at 10:23 PM Like Xu <like.xu.linux@xxxxxxxxx> wrote:
> >>
> >> On 9/1/2022 9:23 am, Jim Mattson wrote:
> >>> On Fri, Dec 10, 2021 at 7:48 PM Jim Mattson <jmattson@xxxxxxxxxx> wrote:
> >>>>
> >>>> On Fri, Dec 10, 2021 at 6:15 PM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
> >>>>>
> >>>>> On 12/10/21 20:25, Jim Mattson wrote:
> >>>>>> In the long run, I'd like to be able to override this system-wide
> >>>>>> setting on a per-VM basis, for VMs that I trust. (Of course, this
> >>>>>> implies that I trust the userspace process as well.)
> >>>>>>
> >>>>>> How would you feel if we were to add a kvm ioctl to override this
> >>>>>> setting, for a particular VM, guarded by an appropriate permissions
> >>>>>> check, like capable(CAP_SYS_ADMIN) or capable(CAP_SYS_MODULE)?
> >>>>>
> >>>>> What's the rationale for guarding this with a capability check? IIRC
> >>>>> you don't have such checks for perf_event_open (apart for getting kernel
> >>>>> addresses, which is not a problem for virtualization).
> >>>>
> >>>> My reasoning was simply that for userspace to override a mode 0444
> >>>> kernel module parameter, it should have the rights to reload the
> >>>> module with the parameter override. I wasn't thinking specifically
> >>>> about PMU capabilities.
> >>
> >> Do we have a precedent on any module parameter rewriting for privileger ?
> >>
> >> A further requirement is whether we can dynamically change this part of
> >> the behaviour when the guest is already booted up.
> >>
> >>>
> >>> Assuming that we trust userspace to decide whether or not to expose a
> >>> virtual PMU to a guest (as we do on the Intel side), perhaps we could
> >>> make use of the existing PMU_EVENT_FILTER to give us per-VM control,
> >>> rather than adding a new module parameter for per-host control.
> >>> If
> >>
> >> Various granularities of control are required to support vPMU production
> >> scenarios, including per-host, per-VM, and dynamic-guest-alive control.
> >>
> >>> userspace calls KVM_SET_PMU_EVENT_FILTER with an action of
> >>> KVM_PMU_EVENT_ALLOW and an empty list of allowed events, KVM could
> >>> just disable the virtual PMU for that VM.
> >>
> >> AMD will also have "CPUID Fn8000_0022_EBX[NumCorePmc, 3:0]".
> >
> > Where do you see this? Revision 3.33 (November 2021) of the AMD APM,
> > volume 3, only goes as high as CPUID Fn8000_0021.
>
> Try APM Revision: 4.04 (November 2021), page 1849/3273,
> "CPUID Fn8000_0022_EBX Extended Performance Monitoring and Debug".

Is this a public document?

> Given the current ambiguity in this revision, the AMD folks will reveal more
> details bout this field in the next revision.
>
> >
> >>>
> >>> Today, the semantics of an empty allow list are quite different from
> >>> the proposed pmuv module parameter being false. However, it should be
> >>> an easy conversion. Would anyone be concerned about changing the
> >>> current semantics of an empty allow list? Is there a need for
> >>> disabling PMU virtualization for legacy userspace implementations that
> >>> can't be modified to ask for an empty allow list?
> >>>
> >>
> >> AFAI, at least one user-space agent has integrated with it plus additional
> >> "action"s.
> >>
> >> Once the API that the kernel presents to user space has been defined,
> >> it's best not to change it and instead fall into remorse.
> >
> > Okay.
> >
> > I propose the following:
> > 1) The new module parameter should apply to Intel as well as AMD, for
> > situations where userspace is not trusted.
> > 2) If the module parameter allows PMU virtualization, there should be
> > a new KVM_CAP whereby userspace can enable/disable PMU virtualization.
> > (Since you require a dynamic toggle, and there is a move afoot to
> > refuse guest CPUID changes once a guest is running, this new KVM_CAP
> > is needed on Intel as well as AMD).
>
> Both hands in favour. Do you need me as a labourer, or you have a ready-made one ?

We could split the work. Since (1) is a modification of the change you
proposed in this thread, perhaps you could apply it to both AMD and
Intel in v2? We can find someone for (2).

> > 3) If the module parameter does not allow PMU virtualization, there
> > should be no userspace override, since we have no precedent for
> > authorizing that kind of override.
>
> Uh, I thought you (Google) had a lot of these (interesting) use cases internally.

We have modified some module parameters so that they can be changed at
runtime, but we don't have any concept of a privileged userspace
overriding a module parameter restriction.

> >
> >> "But I am not a decision maker. " :D
> >>
> >> Thanks,
> >> Like Xu
> >>
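
For illustration only, the three-point proposal quoted above can be read
as a simple decision rule. The sketch below is plain userspace C, not
kernel code and not anything posted in this thread. The struct names,
field names, and the function vm_pmu_is_virtualized() are all
hypothetical, chosen only to mirror points (1) to (3): a host-wide
module parameter that applies to both vendors, a per-VM toggle through
the proposed new KVM_CAP, and no userspace override when the module
parameter forbids PMU virtualization.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the host-wide, read-only (0444) module parameter. */
struct host_pmu_policy {
    bool module_param_allows_pmu;
};

/* Hypothetical per-VM state that userspace would toggle via the proposed KVM_CAP. */
struct vm_pmu_state {
    bool userspace_enabled_pmu;
};

/*
 * Point (3): if the module parameter forbids PMU virtualization, nothing
 * userspace requests can turn it on.
 * Point (2): otherwise, userspace decides per VM.
 * Point (1): the rule has no vendor-specific branch, so it covers Intel and AMD.
 */
static bool vm_pmu_is_virtualized(const struct host_pmu_policy *host,
                                  const struct vm_pmu_state *vm)
{
    assert(host != NULL && vm != NULL);

    if (!host->module_param_allows_pmu)
        return false;

    return vm->userspace_enabled_pmu;
}
```

Under this reading the module parameter acts as a ceiling: the per-VM
setting can only narrow what the host allows, never widen it, which is
the "no userspace override" property in point (3).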