On 2024-06-12 7:17 a.m., Peter Zijlstra wrote: > On Tue, Jun 11, 2024 at 09:27:46AM -0400, Liang, Kan wrote: >> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h >> index dd4920bf3d1b..68c8b93c4e5c 100644 >> --- a/include/linux/perf_event.h >> +++ b/include/linux/perf_event.h >> @@ -945,6 +945,7 @@ struct perf_event_context { >> u64 time; >> u64 timestamp; >> u64 timeoffset; >> + u64 timeguest; >> >> /* >> * These fields let us detect when two contexts have both > >> @@ -651,10 +653,26 @@ __perf_update_times(struct perf_event *event, u64 >> now, u64 *enabled, u64 *runnin >> >> static void perf_event_update_time(struct perf_event *event) >> { >> - u64 now = perf_event_time(event); >> + u64 now; >> + >> + /* Never count the time of an active guest into an exclude_guest event. */ >> + if (event->ctx->timeguest && >> + event->pmu->capabilities & PERF_PMU_CAP_PASSTHROUGH_VPMU) { >> + /* >> + * If a guest is running, use the timestamp while entering the guest. >> + * If the guest is leaving, reset the event timestamp. >> + */ >> + if (__this_cpu_read(perf_in_guest)) >> + event->tstamp = event->ctx->timeguest; >> + else >> + event->tstamp = event->ctx->time; >> + return; >> + } >> >> + now = perf_event_time(event); >> __perf_update_times(event, now, &event->total_time_enabled, >> &event->total_time_running); >> + >> event->tstamp = now; >> } > > So I really don't like this much, An alternative way I can imagine may maintain a dedicated timeline for the PASSTHROUGH PMUs. For that, we probably need two new timelines for the normal events and the cgroup events. That sounds too complex. > and AFAICT this is broken. At the very > least this doesn't work right for cgroup events, because they have their > own timeline. I think we just need a new start time for an event. So we may use perf_event_time() to replace the ctx->time for the cgroup events. diff --git a/kernel/events/core.c b/kernel/events/core.c index 019c237dd456..6c46699c6752 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -665,7 +665,7 @@ static void perf_event_update_time(struct perf_event *event) if (__this_cpu_read(perf_in_guest)) event->tstamp = event->ctx->timeguest; else - event->tstamp = event->ctx->time; + event->tstamp = perf_event_time(event); return; } > > Let me have a poke... Sure. Thanks, Kan