Re: [PATCH v2 06/15] pidfs: allow to retrieve exit information

Oleg Nesterov <oleg@xxxxxxxxxx> · Tue, 4 Mar 2025 22:47:11 +0100

On 03/04, Christian Brauner wrote:
>
> On Tue, Mar 04, 2025 at 06:34:56PM +0100, Oleg Nesterov wrote:
> > On 03/04, Christian Brauner wrote:
> > >
> > > +	task = get_pid_task(pid, PIDTYPE_PID);
> > > +	if (!task) {
> > > +		if (!(mask & PIDFD_INFO_EXIT))
> > > +			return -ESRCH;
> > > +
> > > +		if (!current_in_pidns(pid))
> > > +			return -ESRCH;
> >
> > Damn ;) could you explain the current_in_pidns() check to me ?
> > I am puzzled...
>
> So we currently restrict interactions with pidfd by pid namespace
> hierarchy. Meaning that we ensure that the pidfd is part of the caller's
> pid namespace hierarchy.

Well this is clear... but sorry I still can't understand.

Why do we check current_in_pidns() only if get_pid_task(PIDTYPE_PID)
returns NULL?

And, unless (quite possibly) I am totally confused, if task != NULL
but current_in_pidns() would return false, then

	kinfo.pid = task_pid_vnr(task);

below will set kinfo.pid = 0, and pidfd_info() will return -ESRCH anyway?

> So this check is similar to:
>
> pid_t pid_nr_ns(struct pid *pid, struct pid_namespace *ns)
> {
>         struct upid *upid;
>         pid_t nr = 0;
>
>         if (pid && ns->level <= pid->level) {
>                 upid = &pid->numbers[ns->level];
>                 if (upid->ns == ns)
>                         nr = upid->nr;
>         }
>         return nr;
> }
>
> Only that by the time we perform this check the pid numbers have already
> been freed so we can't use that function directly.

Confused again... Yes, the [u]pid numbers can be already "freed" in that
upid->nr can be already idr_remove()'ed, but

> But the pid namespace
> hierarchy is still alive as that won't be released until the pidfd has
> put the reference on struct @pid.

Yes, so I still don't undestand, sorry :/

IOW. Why not check current_in_pidns() at the start? and do
task = get_pid_task() later, right before

	if (!task)
		goto copy_out;

?

Oleg.