Re: [PATCH RFC 06/10] pidfs: allow to retrieve exit information

On Sun, Mar 02, 2025 at 06:21:49PM +0100, Oleg Nesterov wrote:
> On 03/02, Christian Brauner wrote:
> >
> > On Sun, Mar 02, 2025 at 04:53:46PM +0100, Oleg Nesterov wrote:
> > > On 02/28, Christian Brauner wrote:
> > > >
> > > > Some tools like systemd's journal need to retrieve the exit and cgroup
> > > > information after a process has already been reaped.
> > >               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > >
> > > But unless I am totally confused do_exit() calls pidfd_exit() even
> > > before exit_notify(), the exiting task is not even a zombie yet. It
> > > will be reaped only when it passes exit_notify() and its parent does
> > > wait().
> >
> > The overall goal is that it's possible to retrieve exit status and
> > cgroupid even if the task has already been reaped.
> 
> OK, please see below...
> 
> > It's intentionally placed before exit_notify(), i.e., before the task is
> > a zombie because exit_notify() wakes pidfd-pollers. Ideally, pidfd
> > pollers would be woken and then could use the PIDFD_GET_INFO ioctl to
> > retrieve the exit status.
> 
> This was more or less clear to me. But this doesn't match the "the task has
> already been reaped" goal above...
> 
> > It would however be fine to place it into exit_notify() if it's a better
> > fit there. If you have a preference let me know.
> >
> > I don't see a reason why seeing the exit status before that would be an
> > issue.
> 
> The problem is that it is not clear how we can do this correctly.
> Especially considering the problem with exec...
> 
> > > But what if this file was created without PIDFD_THREAD? If another
> > > thread does exit_group(1) after that, the process's exit code is
> > > 1 << 8, but it can't be retrieved.
> >
> > Yes, I had raised that in an off-list discussion about this as well and
> > was unsure what the cleanest way of dealing with this would be.
> 
> I am not sure either, but again, please see below.
> 
> > > Now, T is very much alive, but pidfs_i(inode)->exit_info != NULL.
> 
> ...
> 
> > What's the best way of handling the de_thread() case? Would moving this
> > into exit_notify() be enough where we also handle
> > PIDFD_THREAD/~PIDFD_THREAD waking?
> 
> I don't think that moving pidfd_exit() into exit_notify() can solve any
> problem.
> 
> But what if we move pidfd_exit() into release_task() paths? Called when
> the task is reaped by the parent/debugger, or if a sub-thread auto-reaps.
> 
> Can the users of pidfd_info(PIDFD_INFO_EXIT) rely on POLLHUP from
> release_task() -> detach_pid() -> __change_pid(new => NULL) ?

Ok, so:

release_task()
-> __exit_signal()
   -> detach_pid()
      -> __change_pid()

That sounds good. So could we do something like:

diff --git a/kernel/exit.c b/kernel/exit.c
index cae475e7858c..66bb5c53454f 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -127,8 +127,10 @@ static void __unhash_process(struct task_struct *p, bool group_dead)
 {
        nr_threads--;
        detach_pid(p, PIDTYPE_PID);
+       pidfs_exit(p); // record exit information for individual thread
        if (group_dead) {
                detach_pid(p, PIDTYPE_TGID);
+               pidfs_exit(p); // record exit information for thread-group leader
                detach_pid(p, PIDTYPE_PGID);
                detach_pid(p, PIDTYPE_SID);

I know this won't work as written, but I'm just trying to get the idea
across: record exit information for both the individual thread and
the thread-group leader in __unhash_process().

That should tackle both problems, i.e., recording exit information for
both the thread and the thread-group leader, as well as the exec
(de_thread()) case?
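
For context, here is a rough userspace sketch of what is possible today
without PIDFD_INFO_EXIT: poll a pidfd for the exit notification and then
reap the child to read its status. It only uses existing interfaces
(pidfd_open(2), poll(2), waitpid(2)); the function name
exit_status_via_pidfd() is made up for illustration, and this is not the
interface proposed in this series. If pidfd_open() is unavailable, the
sketch simply falls back to a blocking waitpid().

```c
#define _GNU_SOURCE
#include <poll.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that exits with `code`, wait for the
 * exit notification on a pidfd, then reap the child.
 * Returns the raw wait status, or -1 on error.
 */
static int exit_status_via_pidfd(int code)
{
	int status = -1;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);	/* child: stays a zombie until the parent reaps it */

#ifdef SYS_pidfd_open
	/* flags == 0, i.e. no PIDFD_THREAD: the pidfd refers to the process */
	int pidfd = (int)syscall(SYS_pidfd_open, pid, 0);

	if (pidfd >= 0) {
		struct pollfd pfd = { .fd = pidfd, .events = POLLIN };

		/* The pidfd becomes readable once the process has exited */
		(void)poll(&pfd, 1, -1);
		close(pidfd);
	}
#endif

	/* Reap the child; the wait family cannot report this status again */
	if (waitpid(pid, &status, 0) != pid)
		return -1;

	return status;
}
```

Calling exit_status_via_pidfd(1) should return a raw status of 1 << 8,
which is the encoding Oleg refers to above, and WEXITSTATUS() of that
value is 1. Once waitpid() has returned, the status cannot be collected
again, which is the gap that recording it in pidfs is meant to close.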



