Christian,

I am already sleeping. I'll try to reply right now, but quite possibly I will
need to correct myself tomorrow ;)

On 03/02, Christian Brauner wrote:
>
> Ok, so:
>
> release_task()
> -> __exit_signal()
>    -> detach_pid()
>       -> __change_pid()
>
> That sounds good. So could we do something like:

Yes, this is what I meant, except...

> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -127,8 +127,10 @@ static void __unhash_process(struct task_struct *p, bool group_dead)
>  {
>         nr_threads--;
>         detach_pid(p, PIDTYPE_PID);
> +       pidfs_exit(p); // record exit information for individual thread

To me it would be better to do this in the caller, release_task(). But this
is minor and I won't insist. Please see below.

>         if (group_dead) {
>                 detach_pid(p, PIDTYPE_TGID);
> +               pidfs_exit(p); // record exit information for thread-group leader

This looks pointless, task_pid(p) is the same.

> I know, as written this won't work but I'm just trying to get the idea
> across of recording exit information for both the individual thread and
> the thread-group leader in __unhash_process().
>
> That should tackle both problems, i.e., recording exit information for
> both thread and thread-group leader as well as exec?

This will fix the problem with mt-exec, but this won't help to discriminate
the leader-exit and the-whole-group-exit cases...

With this (or something like this) change pidfd_info() can only report the
exit code of the already reaped thread/process, leader or not.

I mean... If the leader L exits using sys_exit() and it has live
sub-threads, release_task(L) / __unhash_process(L) will only be called when
the last sub-thread exits and it (or the debugger) does "goto repeat;" in
release_task() to finally reap the leader.

IOW. If someone does sys_pidfd_create(group-leader-pid, PIDFD_THREAD),
pidfd_info() won't report PIDFD_INFO_EXIT if the leader has exited using
sys_exit() before other threads.

But perhaps this is fine?

Let me repeat, I have no idea how and why people use pidfd ;)

Oleg.