On 2024-10-14 1:49 p.m., Peter Zijlstra wrote:
> On Mon, Oct 14, 2024 at 11:51:06AM -0400, Liang, Kan wrote:
>> On 2024-10-14 8:03 a.m., Peter Zijlstra wrote:
>>> On Thu, Aug 01, 2024 at 04:58:23AM +0000, Mingwei Zhang wrote:
>>>> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>>>>
>>>> There will be a dedicated interrupt vector for guests on some platforms,
>>>> e.g., Intel. Add an interface to switch the interrupt vector while
>>>> entering/exiting a guest.
>>>>
>>>> When PMI switch into a new guest vector, guest_lvtpc value need to be
>>>> reflected onto HW, e,g., guest clear PMI mask bit, the HW PMI mask
>>>> bit should be cleared also, then PMI can be generated continuously
>>>> for guest. So guest_lvtpc parameter is added into perf_guest_enter()
>>>> and switch_interrupt().
>>>>
>>>> At switch_interrupt(), the target pmu with PASSTHROUGH cap should
>>>> be found. Since only one passthrough pmu is supported, we keep the
>>>> implementation simply by tracking the pmu as a global variable.
>>>>
>>>> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>>>>
>>>> [Simplify the commit with removal of srcu lock/unlock since only one pmu is
>>>> supported.]
>>>>
>>>> Signed-off-by: Mingwei Zhang <mizhang@xxxxxxxxxx>
>>>> ---
>>>>  include/linux/perf_event.h |  9 +++++++--
>>>>  kernel/events/core.c       | 36 ++++++++++++++++++++++++++++++++++--
>>>>  2 files changed, 41 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
>>>> index 75773f9890cc..aeb08f78f539 100644
>>>> --- a/include/linux/perf_event.h
>>>> +++ b/include/linux/perf_event.h
>>>> @@ -541,6 +541,11 @@ struct pmu {
>>>>  	 * Check period value for PERF_EVENT_IOC_PERIOD ioctl.
>>>>  	 */
>>>>  	int (*check_period) (struct perf_event *event, u64 value); /* optional */
>>>> +
>>>> +	/*
>>>> +	 * Switch the interrupt vectors, e.g., guest enter/exit.
>>>> +	 */
>>>> +	void (*switch_interrupt) (bool enter, u32 guest_lvtpc); /* optional */
>>>>  };
>>>
>>> I'm thinking the guets_lvtpc argument shouldn't be part of the
>>> interface. That should be PMU implementation data and accessed by the
>>> method implementation.
>>
>> I think the name of the perf_switch_interrupt() is too specific.
>> Here should be to switch the guest context. The interrupt should be just
>> part of the context. Maybe a interface as below
>>
>> void (*switch_guest_ctx) (bool enter, void *data); /* optional */
>
> I don't think you even need the data thing. For example, the x86/intel
> implementation can just look at a x86_pmu data field to find the magic
> value.

The new vector is created by KVM, not perf. So it cannot be found in the
x86_pmu data field. Perf needs it to update the interrupt vector so the
guest PMI can be handled by KVM directly.

Thanks,
Kan
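
To illustrate the point about where the value comes from, below is a
minimal userspace sketch (not kernel code, and not the code in the patch).
It only models the callback shape quoted above, where the caller hands the
guest LVTPC value to the PMU through an opaque pointer. All mock_* names,
the payload struct, and the constant are hypothetical and exist only for
this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware LVTPC register. */
uint32_t mock_hw_lvtpc;

/* Hypothetical host-side LVTPC value, chosen only for the demo. */
#define MOCK_HOST_LVTPC 0x400u

/* Hypothetical payload owned by the caller (KVM in this thread). */
struct mock_guest_switch_data {
	uint32_t guest_lvtpc;
};

/* Cut-down stand-in for struct pmu with the callback shape from the thread. */
struct mock_pmu {
	void (*switch_guest_ctx)(bool enter, void *data); /* optional */
};

/* PMU-side implementation: it cannot know the guest value by itself. */
void mock_switch_guest_ctx(bool enter, void *data)
{
	if (enter) {
		const struct mock_guest_switch_data *d = data;

		/* On enter the caller must supply the guest LVTPC value. */
		assert(d != NULL);
		mock_hw_lvtpc = d->guest_lvtpc;
	} else {
		/* On exit restore the host setting; no payload is needed. */
		mock_hw_lvtpc = MOCK_HOST_LVTPC;
	}
}

/* Single passthrough PMU tracked as a global, as the commit message describes. */
struct mock_pmu mock_passthrough_pmu = {
	.switch_guest_ctx = mock_switch_guest_ctx,
};

/* Core-side helper: forwards the caller-provided value to the PMU. */
void mock_perf_guest_enter(uint32_t guest_lvtpc)
{
	struct mock_guest_switch_data d = { .guest_lvtpc = guest_lvtpc };

	if (mock_passthrough_pmu.switch_guest_ctx)
		mock_passthrough_pmu.switch_guest_ctx(true, &d);
}

/* Core-side helper for the exit path. */
void mock_perf_guest_exit(void)
{
	if (mock_passthrough_pmu.switch_guest_ctx)
		mock_passthrough_pmu.switch_guest_ctx(false, NULL);
}
```

In this sketch, calling mock_perf_guest_enter() with a value picked by the
caller programs that value into the mock register, and
mock_perf_guest_exit() restores the host value. The PMU side never has to
store the guest value itself, which is the reason for passing it through
the interface.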