On Sat, Jan 27, 2024 at 11:54:32AM +0100, Oleg Nesterov wrote:
> Hi Tycho,
>
> On 01/26, Tycho Andersen wrote:
> >
> > On Thu, Jan 25, 2024 at 03:08:31PM +0100, Oleg Nesterov wrote:
> > > What do you think?
> >
> > Thank you, it passes all my tests.
>
> Great, thanks!
>
> OK, I'll make v2 on top of the recent
> "pidfd: cleanup the usage of __pidfd_prepare's flags"
>
> but we need to finish our discussion with Christian about the
> usage of O_EXCL.
>
> As for clone(CLONE_PIDFD | CLONE_THREAD), this is trivial but
> I think this needs another discussion too, lets do this later.
>
> > > +	/* unnecessary if do_notify_parent() was already called,
> > > +	   we can do better */
> > > +	do_notify_pidfd(tsk);
> >
> > "do better" here could be something like,
> >
> > [...snip...]
>
> No, no, please see below.
>
> For the moment, please forget about PIDFD_THREAD, lets discuss
> the current behaviour.
>
> > but even with that, there's other calls in the tree to
> > do_notify_parent() that might double notify.
>
> Yes, and we can't avoid this. Well, perhaps do_notify_parent()
> can do something like
>
> 	if (ptrace_reparented())
> 		do_notify_pidfd();
>
> so that only the "final" do_notify_parent() does do_notify_pidfd()
> but this needs another discussion and in fact I don't think this
> would be right or make much sense. Lets forget this for now.

It seems like (and the current pidfd_test enforces for some cases) we
want exactly one notification for a task dying. I don't understand
how we guarantee this now, with all of these calls.

> > This brings up another interesting behavior that I noticed while
> > testing this, if you do a poll() on pidfd, followed quickly by a
> > pidfd_getfd() on the same thread you just got an event on, you can
> > sometimes get an EBADF from __pidfd_fget() instead of the more
> > expected ESRCH higher up the stack.
>
> exit_notify() is called after exit_files(). pidfd_getfd() returns
> ESRCH if the exiting thread completes release_task(), otherwise it
> returns EBADF because ->files == NULL. This too doesn't really
> depend on PIDFD_THREAD.

Yup, understood. It just seems like an inconsistency we might want to
fix.

> > I wonder if it makes sense to abuse ->f_flags to add a PIDFD_NOTIFIED?
> > Then we can refuse further pidfd syscall operations in a sane way, and
>
> But how? We only have "struct pid *", how can we find all files
> "attached" to this pid?

Yeah, we'd need some other linkage as Christian points out. But if
there is a predicate we can write that says whether this task has been
notified or not, it's not necessary. I just don't understand what that
is. But maybe your patch will make it clearer.

Tycho
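
For reference, a minimal userspace sketch of the poll() followed by
pidfd_getfd() sequence described above. It is only an illustration, not
the actual test: it uses a forked child process rather than a thread
(the behaviour reportedly does not depend on PIDFD_THREAD), the helper
name getfd_errno_after_exit() is made up, and the fallback syscall
numbers are an assumption for systems with old headers. Because the
child is deliberately left unreaped while pidfd_getfd() runs,
release_task() has not happened yet, so this should exercise the
->files == NULL path; which errno comes back may differ between kernel
versions.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumed fallback numbers (common to most architectures) for old headers. */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_pidfd_getfd
#define SYS_pidfd_getfd 438
#endif

/*
 * Hypothetical helper: fork a child that exits immediately, wait for its
 * pidfd to become readable, then call pidfd_getfd() before reaping it.
 * Returns the errno set by pidfd_getfd(), 0 if the call unexpectedly
 * succeeded, or -1 if the setup (fork, pidfd_open, poll) failed.
 */
int getfd_errno_after_exit(void)
{
	pid_t child = fork();

	if (child < 0)
		return -1;
	if (child == 0)
		_exit(0);

	int result = -1;
	int pidfd = (int)syscall(SYS_pidfd_open, child, 0);

	if (pidfd >= 0) {
		struct pollfd pfd = { .fd = pidfd, .events = POLLIN };

		/* Wait (at most 5 seconds) for the exit notification. */
		if (poll(&pfd, 1, 5000) == 1) {
			/* The child is still an unreaped zombie here. */
			int fd = (int)syscall(SYS_pidfd_getfd, pidfd, 0, 0);

			if (fd >= 0) {
				close(fd);
				result = 0;
			} else {
				result = errno;
			}
		}
		close(pidfd);
	}

	waitpid(child, NULL, 0);	/* reap the zombie */
	return result;
}
```

Printing strerror() of the return value shows which error a given
kernel reports for a task that has been notified but not yet released.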