On 07/09, Kirill Kolyshkin wrote: > > On Tue, Jul 9, 2024 at 2:43 AM Oleg Nesterov <oleg@xxxxxxxxxx> wrote: > > > > Hi Kir, > > > > On 07/08, Kir Kolyshkin wrote: > > > > > > [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=64bef697d33b > > > > The changelog says: > > > > pidfd: implement PIDFD_THREAD flag for pidfd_open() > > > > With this flag: > > > > - pidfd_open() doesn't require that the target task must be > > a thread-group leader > > > > - pidfd_poll() succeeds when the task exits and becomes a > > zombie (iow, passes exit_notify()), even if it is a leader > > and thread-group is not empty. > > > > This means that the behaviour of pidfd_poll(PIDFD_THREAD, > > pid-of-group-leader) is not well defined if it races with > > exec() from its sub-thread; pidfd_poll() can succeed or not > > depending on whether pidfd_task_exited() is called before > > or after exchange_tids(). > > > > > +The behavior depends on whether the file descriptor refers > > > +to a process (thread-group leader) or a thread (see > > > +.B PIDFD_THREAD > > > +above): > > > +.RS > > > +.IP \[bu] 3 > > > +For a thread-group leader, the polling task is woken if the > > > +thread-group is empty. In other words, if the thread-group > > > +leader task exits when there are still threads alive in its > > > +thread-group, the polling task will not be woken when the > > > +thread-group leader exits, but rather when the last thread in the > > > +thread-group exits. > > > > so this part is not accurate. > > I copied the above text almost verbatim from the merge commit > description (commit b5683a37c881). And b5683a37c881 says For thread-group leader pidfds ... ... For thread-specific pidfds the polling task is woken if the thread exits. I think that "thread-specific pidfds" means pidfd created with the PIDFD_THREAD flag. > Until it's clarified, let's remove this text, adding a TODO instead. OK. but you can also look at the (trivial) code: static __poll_t pidfd_poll(struct file *file, struct poll_table_struct *pts) { struct pid *pid = pidfd_pid(file); bool thread = file->f_flags & PIDFD_THREAD; struct task_struct *task; __poll_t poll_flags = 0; poll_wait(file, &pid->wait_pidfd, pts); /* * Depending on PIDFD_THREAD, inform pollers when the thread * or the whole thread-group exits. */ guard(rcu)(); task = pid_task(pid, PIDTYPE_PID); if (!task) poll_flags = EPOLLIN | EPOLLRDNORM | EPOLLHUP; else if (task->exit_state && (thread || thread_group_empty(task))) poll_flags = EPOLLIN | EPOLLRDNORM; return poll_flags; } please note that the thread_group_empty() check has no effect if bool thread == f_flags & PIDFD_THREAD is true. In this case pidfd_poll() succeeds if the the target task has already exited (passed exit_notify, so ->exit_state is not zero), no matter if it is a leader or not. Oleg.