On Wed, Oct 04, 2023, Sean Christopherson wrote:
> Thinking about this more, what if we do a blend of KVM's FPU swapping and debug
> register swapping?
>
>   A. Load guest PMU state in vcpu_enter_guest() after IRQs are disabled
>   B. Put guest PMU state (and load host state) in vcpu_enter_guest() before IRQs
>      are enabled, *if and only if* the current CPU has one or more perf events
>      that want to use the hardware PMU
>   C. Put guest PMU state at vcpu_put()
>   D. Add a perf callback that is invoked from IRQ context when perf wants to
>      configure a new PMU-based event, *before* actually programming the MSRs,
>      and have KVM's callback put the guest PMU state
>
> If there are host perf events that want to use the PMU, then KVM will swap fairly
> aggressively and the "downtime" of the host perf events will be limited to the
> small window around VM-Enter/VM-Exit.
>
> If there are no such host events, KVM will swap on the first entry to the guest,
> and keep the guest PMU loaded until the vCPU is put.
>
> The perf callback in (D) would allow perf to program system-wide events on all
> CPUs without clobbering guest PMU state.
>
> I think that would make everyone happy.  As long as our hosts don't create perf
> events, then we get the "swap as little as possible" behavior without significantly
> impacting the host's ability to utilize perf.  If our host screws up and creates
> perf events on CPUs that are running vCPUs, then the degraded vCPU performance is
> on us.
>
> Rough sketch below, minus the perf callback or any of the actual swapping logic.

Another reason to go for an approach that doesn't completely kill off host PMU
usage: just because we don't plan on enabling perf events in *production*, there
will undoubtedly be times when we want to enable perf events to debug issues
(outside of prod) in the host kernel/KVM that affect VMs with a passthrough PMU.

So I'll add a self-NAK to the idea of completely disabling the host PMU.  I think
that would burn us quite badly at some point.
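
To make the (A)-(D) flow quoted above a bit more concrete, here is a toy
userspace model of the swap policy.  This is only a sketch of the ownership
logic, not KVM or perf code: every toy_* name, the struct, and the swap counter
are hypothetical and exist purely for illustration.  No MSRs are touched, and
IRQ disabling is only noted in comments.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in KVM or perf. */
enum toy_pmu_owner { TOY_PMU_HOST, TOY_PMU_GUEST };

struct toy_cpu {
	enum toy_pmu_owner owner;	/* whose state is loaded in the "hardware" PMU */
	bool host_wants_pmu;		/* host has perf events that need the PMU here */
	unsigned int nr_swaps;		/* number of host<->guest PMU context switches */
};

/* Load guest PMU state; a no-op if the guest state is already loaded. */
void toy_load_guest_pmu(struct toy_cpu *cpu)
{
	if (cpu->owner != TOY_PMU_GUEST) {
		cpu->owner = TOY_PMU_GUEST;
		cpu->nr_swaps++;
	}
}

/* Put guest PMU state (and load host state); a no-op if already put. */
void toy_put_guest_pmu(struct toy_cpu *cpu)
{
	if (cpu->owner != TOY_PMU_HOST) {
		cpu->owner = TOY_PMU_HOST;
		cpu->nr_swaps++;
	}
}

/* One pass through the guest entry/exit path, covering (A) and (B). */
void toy_enter_guest(struct toy_cpu *cpu)
{
	/* (A) In the real flow, IRQs would already be disabled here. */
	toy_load_guest_pmu(cpu);

	/* The guest must own the PMU for the whole time it runs. */
	assert(cpu->owner == TOY_PMU_GUEST);

	/* ... VM-Enter, guest runs, VM-Exit ... */

	/* (B) Swap back eagerly only if the host actually wants the PMU. */
	if (cpu->host_wants_pmu)
		toy_put_guest_pmu(cpu);

	/* IRQs would be re-enabled after this point. */
}

/* (C) The vCPU is being put: always hand the PMU back to the host. */
void toy_vcpu_put(struct toy_cpu *cpu)
{
	toy_put_guest_pmu(cpu);
}

/*
 * (D) Stand-in for the proposed perf callback: invoked before perf programs
 * a new PMU-based event, so guest state is put before anything is clobbered.
 */
void toy_perf_new_event(struct toy_cpu *cpu)
{
	toy_put_guest_pmu(cpu);
	cpu->host_wants_pmu = true;
}
```

In this model, repeated calls to toy_enter_guest() with no host events cost a
single swap in total, while every entry after toy_perf_new_event() costs two
(load on entry, put before IRQs are enabled), which is the "swap as little as
possible unless the host asks for the PMU" behavior described above.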