The following commit has been merged into the perf/core branch of tip:

Commit-ID:     f4b07fd62d4d11d57a15cb4ae01b3833282eb8f6
Gitweb:        https://git.kernel.org/tip/f4b07fd62d4d11d57a15cb4ae01b3833282eb8f6
Author:        Namhyung Kim <namhyung@xxxxxxxxxx>
AuthorDate:    Sun, 16 Mar 2025 23:17:45 -07:00
Committer:     Ingo Molnar <mingo@xxxxxxxxxx>
CommitterDate: Mon, 17 Mar 2025 08:31:03 +01:00

perf/core: Use POLLHUP for pinned events in error

Pinned performance events can enter an error state when they fail to be
scheduled in the context due to a failed constraint or some other conflict
or condition.

In error state these events won't generate any samples anymore and are
silently ignored until they are recovered by PERF_EVENT_IOC_ENABLE,
or the condition can also change so that they can be scheduled in.

Tooling should be allowed to know about the state change, but
currently there's no mechanism to notify tooling when events enter
an error state.

One way to do this is to issue a POLLHUP event to poll(2) to handle this.
Reading events in an error state would return 0 (EOF) and it matches to the
behavior of POLLHUP according to the man page.

Tooling should remove the fd of the event from pollfd after getting POLLHUP,
otherwise it'll be returned repeatedly.

[ mingo: Clarified the changelog ]

Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Link: https://lore.kernel.org/r/20250317061745.1777584-1-namhyung@xxxxxxxxxx
---
 kernel/events/core.c |  9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2533fc3..ace1bcc 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3984,6 +3984,11 @@ static int merge_sched_in(struct perf_event *event, void *data)
 		if (event->attr.pinned) {
 			perf_cgroup_event_disable(event, ctx);
 			perf_event_set_state(event, PERF_EVENT_STATE_ERROR);
+
+			if (*perf_event_fasync(event))
+				event->pending_kill = POLL_HUP;
+
+			perf_event_wakeup(event);
 		} else {
 			struct perf_cpu_pmu_context *cpc = this_cpc(event->pmu_ctx->pmu);
 
@@ -5925,6 +5930,10 @@ static __poll_t perf_poll(struct file *file, poll_table *wait)
 	if (is_event_hup(event))
 		return events;
 
+	if (unlikely(READ_ONCE(event->state) == PERF_EVENT_STATE_ERROR &&
+		     event->attr.pinned))
+		return events;
+
 	/*
 	 * Pin the event->rb by taking event->mmap_mutex; otherwise
 	 * perf_event_set_output() can swizzle our rb and make us miss wakeups.
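
For tooling authors, the changelog's advice to remove the fd from pollfd
after POLLHUP could look roughly like the userspace sketch below. It is
not part of this patch: drop_hup_fds() is a hypothetical helper name, and
the sketch assumes the caller keeps its own copy of each event fd (for
example to retry PERF_EVENT_IOC_ENABLE later). It relies only on the
documented poll(2) behavior that entries with a negative fd are ignored.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the patch: call it after poll() returns.
 * Every entry that reported POLLHUP is disabled by setting its fd to -1,
 * because poll(2) ignores entries whose fd is negative. Without this, a
 * pinned event in error state would be reported again on every poll() call.
 *
 * Returns the number of entries that were disabled.
 */
static int drop_hup_fds(struct pollfd *fds, nfds_t nfds)
{
	int dropped = 0;

	assert(fds != NULL || nfds == 0);

	for (nfds_t i = 0; i < nfds; i++) {
		/* Skip slots that were already disabled earlier. */
		if (fds[i].fd < 0)
			continue;

		if (fds[i].revents & POLLHUP) {
			fds[i].fd = -1;		/* poll() skips negative fds */
			fds[i].revents = 0;	/* clear the stale result */
			dropped++;
		}
	}

	return dropped;
}
```

A polling loop would call drop_hup_fds(fds, nfds) right after each
successful poll(), before handling POLLIN on the remaining entries.
Marking the slot with -1 rather than compacting the array keeps the
indices of the other events stable, which is convenient when the tool
maintains a parallel array of per-event state.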