On 2022/12/23 04:59, Peter Zijlstra wrote: > On Wed, Dec 21, 2022 at 10:42:39AM +0800, Chengming Zhou wrote: > >>> Does this help? >>> >>> diff --git a/kernel/events/core.c b/kernel/events/core.c >>> index e47914ac8732..bbff551783e1 100644 >>> --- a/kernel/events/core.c >>> +++ b/kernel/events/core.c >>> @@ -12689,7 +12689,8 @@ SYSCALL_DEFINE5(perf_event_open, >>> return event_fd; >>> >>> err_context: >>> - /* event->pmu_ctx freed by free_event() */ >>> + put_pmu_ctx(event->pmu_ctx); >>> + event->pmu_ctx = NULL; /* _free_event() */ >>> err_locked: >>> mutex_unlock(&ctx->mutex); >>> perf_unpin_context(ctx); >> >> Tested-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx> >> >> While reviewing the code, I found perf_event_create_kernel_counter() >> has the similar problem in the "err_pmu_ctx" error handling path: > > Right you are, updated the patch, thanks! > >> CPU0 CPU1 >> perf_event_create_kernel_counter() >> // inc ctx refcnt >> find_get_context(task, event) (1) >> >> // inc pmu_ctx refcnt >> pmu_ctx = find_get_pmu_context() >> >> event->pmu_ctx = pmu_ctx >> ... >> goto err_pmu_ctx: >> // dec pmu_ctx refcnt >> put_pmu_ctx(pmu_ctx) (2) >> >> mutex_unlock(&ctx->mutex) >> // dec ctx refcnt >> put_ctx(ctx) >> perf_event_exit_task_context() >> mutex_lock() >> mutex_unlock() >> // last refcnt put >> put_ctx() >> free_event(event) >> if (event->pmu_ctx) // True >> put_pmu_ctx() (3) >> // will access freed pmu_ctx or ctx >> >> if (event->ctx) // False >> put_ctx() > > This doesn't look right; iirc you can hit this without concurrency, > something like so: Right, pmu_ctx UaF can hit without concurrency. But ctx has been created with refcnt == 1, which referenced by the task, so the last refcnt put must be in perf_event_exit_task_context(). Maybe we can improve this, don't let ctx referenced by the task? Then ctx can be freed when all perf_events are removed, instead of having to wait for the task to exit. Maybe I missed something... > > > // note that when getting here, we've not passed > // perf_install_in_context() and event->ctx == NULL. > err_pmu_ctx: > put_pmu_ctx(); > put_ctx(); // last, actually frees ctx This put_ctx() dec refcnt from 2 to 1, perf_event_exit_task_context() will put the last refcnt and free it. > .. > err_alloc: > free_event() > _free_event() > if (event->pmu_ctx) // true, because we forgot to clear > put_pmu_ctx() // hits 0 because double put > // goes and touch epc->ctx and UaF > >