Hi Oleg,

On Thu, Jan 25, 2024 at 03:08:31PM +0100, Oleg Nesterov wrote:
> What do you think?

Thank you, it passes all my tests.

> +	/* unnecessary if do_notify_parent() was already called,
> +	   we can do better */
> +	do_notify_pidfd(tsk);

"do better" here could be something like,

diff --git a/kernel/exit.c b/kernel/exit.c
index efe8f1d3a6af..7e545393f2f5 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -742,6 +742,7 @@ static void exit_notify(struct task_struct *tsk, int group_dead)
 	bool autoreap;
 	struct task_struct *p, *n;
 	LIST_HEAD(dead);
+	bool needs_notify = true;
 
 	write_lock_irq(&tasklist_lock);
 	forget_original_parent(tsk, &dead);
@@ -756,16 +757,21 @@ static void exit_notify(struct task_struct *tsk, int group_dead)
 				!ptrace_reparented(tsk) ?
 			tsk->exit_signal : SIGCHLD;
 		autoreap = do_notify_parent(tsk, sig);
+		needs_notify = false;
 	} else if (thread_group_leader(tsk)) {
-		autoreap = thread_group_empty(tsk) &&
-			do_notify_parent(tsk, tsk->exit_signal);
+		autoreap = false;
+		if (thread_group_empty(tsk)) {
+			autoreap = do_notify_parent(tsk, tsk->exit_signal);
+			needs_notify = false;
+		}
 	} else {
 		autoreap = true;
 	}
 
 	/* unnecessary if do_notify_parent() was already called,
 	   we can do better */
-	do_notify_pidfd(tsk);
+	if (needs_notify)
+		do_notify_pidfd(tsk);
 
 	if (autoreap) {
 		tsk->exit_state = EXIT_DEAD;

but even with that, there are other calls in the tree to
do_notify_parent() that might double notify.

This brings up another interesting behavior that I noticed while
testing this: if you do a poll() on a pidfd, followed quickly by a
pidfd_getfd() on the same thread you just got an event on, you can
sometimes get an EBADF from __pidfd_fget() instead of the more
expected ESRCH from higher up the stack.

I wonder if it makes sense to abuse ->f_flags to add a PIDFD_NOTIFIED
flag? Then we could refuse further pidfd syscall operations in a sane
way, and also "do better" above by checking this flag in
do_notify_pidfd() before notifying again?

Tycho