On Thu, Dec 07, 2023 at 10:25:09PM +0100, Christian Brauner wrote:
> > If these concerns are correct
>
> So, ok. I misremembered this. The scenario I had been thinking of is
> basically the following.
>
> We have a thread-group with thread-group leader 1234 and a thread with
> 4567 in that thread-group. Assume current thread-group leader is tsk1
> and the non-thread-group leader is tsk2. tsk1 uses struct pid *tg_pid
> and tsk2 uses struct pid *t_pid. The struct pids look like this after
> creation of both thread-group leader tsk1 and thread tsk2:
>
>         TGID 1234                          TID 4567
>         tg_pid[PIDTYPE_PID]  = tsk1        t_pid[PIDTYPE_PID]  = tsk2
>         tg_pid[PIDTYPE_TGID] = tsk1        t_pid[PIDTYPE_TGID] = NULL
>
> IOW, tsk2's struct pid has never been used as a thread-group leader and
> thus PIDTYPE_TGID is NULL. Now assume someone does create pidfds for
> tsk1 and for tsk2:
>
>         tg_pidfd = pidfd_open(tsk1)        t_pidfd = pidfd_open(tsk2)
>         -> tg_pidfd->private_data = tg_pid -> t_pidfd->private_data = t_pid
>
> So we stash away struct pid *tg_pid for a pidfd_open() on tsk1 and we
> stash away struct pid *t_pid for a pidfd_open() on tsk2.
>
> If we wait on that task via P_PIDFD we get:
>
>         /* waiting through pidfd */
>         waitid(P_PIDFD, tg_pidfd)          waitid(P_PIDFD, t_pidfd)
>         tg_pid[PIDTYPE_TGID] == tsk1       t_pid[PIDTYPE_TGID] == NULL
>         => succeeds                        => fails
>
> Because struct pid *tg_pid is used as a thread-group leader struct pid we
> can wait on that tsk1. But we can't via the non-thread-group leader
> pidfd because the struct pid *t_pid has never been used as a
> thread-group leader.
>
> Now assume, t_pid exec's and the struct pids are transferred. IIRC, we
> get:
>
>         tg_pid[PIDTYPE_PID]  = tsk2        t_pid[PIDTYPE_PID]  = tsk1
>         tg_pid[PIDTYPE_TGID] = tsk2        t_pid[PIDTYPE_TGID] = NULL
>
> If we wait on that task via P_PIDFD we get:
>
>         /* waiting through pidfd */
>         waitid(P_PIDFD, tg_pidfd)          waitid(P_PIDFD, t_pidfd)
>         tg_pid[PIDTYPE_TGID] == tsk2       t_pid[PIDTYPE_TGID] == NULL
>         => succeeds                        => fails
>
> Which is what we want. So effectively this should all work and I
> misremembered the struct pid linkage. So afaict we don't even have a
> problem here which is great.

It sounds like we need some tests for waitpid() directly though, to
ensure the semantics stay stable. I can add those and send a v3,
assuming the location of do_notify_pidfd() looks ok to you in v2:

https://lore.kernel.org/all/20231207170946.130823-1-tycho@tycho.pizza/

Tycho
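
For reference, the linkage described in the quoted walkthrough can be
modelled in a few lines of user-space C. This is only a toy sketch, not
kernel code: struct pid_model, struct task, can_wait_pidfd() and
model_exec_by_thread() are hypothetical names made up for illustration,
and the model assumes the handover on exec happens exactly as the quoted
text recalls it (the "IIRC" above still applies).

```c
#include <assert.h>
#include <stddef.h>

/* Toy user-space model of the linkage above. NOT kernel code. */
enum pid_type { PIDTYPE_PID, PIDTYPE_TGID, PIDTYPE_MAX };

/* Hypothetical stand-in for a task; only the name matters here. */
struct task {
	const char *name;
};

/* Hypothetical stand-in for struct pid: one task slot per pid type. */
struct pid_model {
	int nr;
	struct task *tasks[PIDTYPE_MAX];
};

/*
 * Models the check in the walkthrough: waiting through a pidfd only
 * succeeds if the stashed struct pid is in use as a thread-group leader
 * pid, i.e. its PIDTYPE_TGID slot is not NULL.
 */
static int can_wait_pidfd(const struct pid_model *pid)
{
	return pid->tasks[PIDTYPE_TGID] != NULL;
}

/*
 * Models a non-leader thread exec'ing, as recalled in the walkthrough:
 * the exec'ing thread takes over both slots of the leader's struct pid,
 * and the old leader ends up on the thread's struct pid, whose
 * PIDTYPE_TGID slot stays NULL.
 */
static void model_exec_by_thread(struct pid_model *tg_pid,
				 struct pid_model *t_pid)
{
	struct task *old_leader = tg_pid->tasks[PIDTYPE_PID];
	struct task *execing = t_pid->tasks[PIDTYPE_PID];

	/* The thread's struct pid was never a thread-group leader pid. */
	assert(t_pid->tasks[PIDTYPE_TGID] == NULL);

	tg_pid->tasks[PIDTYPE_PID] = execing;
	tg_pid->tasks[PIDTYPE_TGID] = execing;
	t_pid->tasks[PIDTYPE_PID] = old_leader;
}
```

Setting up tg_pid = { 1234, { &tsk1, &tsk1 } } and t_pid = { 4567,
{ &tsk2, NULL } } and calling model_exec_by_thread(&tg_pid, &t_pid)
leaves can_wait_pidfd(&tg_pid) true and can_wait_pidfd(&t_pid) false
both before and after the exec, which matches the two tables above.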