On 2024-10-14 7:20 a.m., Peter Zijlstra wrote:
> On Thu, Aug 01, 2024 at 04:58:19AM +0000, Mingwei Zhang wrote:
>> +void perf_guest_exit(void)
>> +{
>> +	struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context);
>> +
>> +	lockdep_assert_irqs_disabled();
>> +
>> +	perf_ctx_lock(cpuctx, cpuctx->task_ctx);
>> +
>> +	if (WARN_ON_ONCE(!__this_cpu_read(perf_in_guest)))
>> +		goto unlock;
>> +
>> +	perf_ctx_disable(&cpuctx->ctx, EVENT_GUEST);
>> +	ctx_sched_in(&cpuctx->ctx, EVENT_GUEST);
>> +	perf_ctx_enable(&cpuctx->ctx, EVENT_GUEST);
>> +	if (cpuctx->task_ctx) {
>> +		perf_ctx_disable(cpuctx->task_ctx, EVENT_GUEST);
>> +		ctx_sched_in(cpuctx->task_ctx, EVENT_GUEST);
>> +		perf_ctx_enable(cpuctx->task_ctx, EVENT_GUEST);
>> +	}
>
> Does this not violate the scheduling order of events? AFAICT this will
> do:
>
>   cpu pinned
>   cpu flexible
>   task pinned
>   task flexible
>
> as opposed to:
>
>   cpu pinned
>   task pinned
>   cpu flexible
>   task flexible
>
> We have the perf_event_sched_in() helper for this.

Yes, we can avoid the per-context sched_in() with the EVENT_GUEST flag
and invoke the perf_event_sched_in() helper to do the real scheduling
instead (a sketch of that rework follows below). I will run more tests
to double-check.

Thanks,
Kan

>
>> +
>> +	__this_cpu_write(perf_in_guest, false);
>> +unlock:
>> +	perf_ctx_unlock(cpuctx, cpuctx->task_ctx);
>> +}
>> +EXPORT_SYMBOL_GPL(perf_guest_exit);
>
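For reference, a minimal sketch of the rework described above. This is
not the actual patch: the event_type parameter on perf_event_sched_in()
is an assumption here (the mainline helper takes only cpuctx and ctx),
added so the EVENT_GUEST filter can be passed down to ctx_sched_in().

/*
 * Sketch only: schedule both contexts back in through one helper so
 * the required ordering is preserved:
 *   cpu pinned -> task pinned -> cpu flexible -> task flexible
 *
 * ASSUMPTION: perf_event_sched_in() grows an event_type argument so
 * EVENT_GUEST can be forwarded to ctx_sched_in().
 */
static void perf_event_sched_in(struct perf_cpu_context *cpuctx,
				struct perf_event_context *ctx,
				enum event_type_t event_type)
{
	ctx_sched_in(&cpuctx->ctx, EVENT_PINNED | event_type);
	if (ctx)
		ctx_sched_in(ctx, EVENT_PINNED | event_type);
	ctx_sched_in(&cpuctx->ctx, EVENT_FLEXIBLE | event_type);
	if (ctx)
		ctx_sched_in(ctx, EVENT_FLEXIBLE | event_type);
}

void perf_guest_exit(void)
{
	struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context);

	lockdep_assert_irqs_disabled();

	perf_ctx_lock(cpuctx, cpuctx->task_ctx);

	if (WARN_ON_ONCE(!__this_cpu_read(perf_in_guest)))
		goto unlock;

	/* Disable both contexts before re-scheduling host events. */
	perf_ctx_disable(&cpuctx->ctx, EVENT_GUEST);
	if (cpuctx->task_ctx)
		perf_ctx_disable(cpuctx->task_ctx, EVENT_GUEST);

	/* One call keeps pinned events ahead of flexible ones. */
	perf_event_sched_in(cpuctx, cpuctx->task_ctx, EVENT_GUEST);

	if (cpuctx->task_ctx)
		perf_ctx_enable(cpuctx->task_ctx, EVENT_GUEST);
	perf_ctx_enable(&cpuctx->ctx, EVENT_GUEST);

	__this_cpu_write(perf_in_guest, false);
unlock:
	perf_ctx_unlock(cpuctx, cpuctx->task_ctx);
}

The point of the restructuring is that the disable/enable pairs now
bracket a single sched-in pass, rather than each context scheduling
itself in (pinned and flexible together) before the next context gets
a turn, which is what produced the wrong ordering Peter flagged.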