On Tue, Mar 04, 2025 at 10:47:11PM +0100, Oleg Nesterov wrote: > On 03/04, Christian Brauner wrote: > > > > On Tue, Mar 04, 2025 at 06:34:56PM +0100, Oleg Nesterov wrote: > > > On 03/04, Christian Brauner wrote: > > > > > > > > + task = get_pid_task(pid, PIDTYPE_PID); > > > > + if (!task) { > > > > + if (!(mask & PIDFD_INFO_EXIT)) > > > > + return -ESRCH; > > > > + > > > > + if (!current_in_pidns(pid)) > > > > + return -ESRCH; > > > > > > Damn ;) could you explain the current_in_pidns() check to me ? > > > I am puzzled... > > > > So we currently restrict interactions with pidfd by pid namespace > > hierarchy. Meaning that we ensure that the pidfd is part of the caller's > > pid namespace hierarchy. > > Well this is clear... but sorry I still can't understand. Ok, it sounded like you wanted me to explain what current_in_pidns() does not why it's placed where it is placed. :) > > Why do we check current_in_pidns() only if get_pid_task(PIDTYPE_PID) > returns NULL? Because if task != NULL we already catch it kinfo.pid and since we can't skip the call to task_pid_vnr() it seemed redundant to do that check twice. But if we do it before get_task_pid() it makes more sense ofc. > > And, unless (quite possibly) I am totally confused, if task != NULL > but current_in_pidns() would return false, then > > kinfo.pid = task_pid_vnr(task); > > below will set kinfo.pid = 0, and pidfd_info() will return -ESRCH anyway? Yes. > > > So this check is similar to: > > > > pid_t pid_nr_ns(struct pid *pid, struct pid_namespace *ns) > > { > > struct upid *upid; > > pid_t nr = 0; > > > > if (pid && ns->level <= pid->level) { > > upid = &pid->numbers[ns->level]; > > if (upid->ns == ns) > > nr = upid->nr; > > } > > return nr; > > } > > > > Only that by the time we perform this check the pid numbers have already > > been freed so we can't use that function directly. > > Confused again... Yes, the [u]pid numbers can be already "freed" in that > upid->nr can be already idr_remove()'ed, but > > > But the pid namespace > > hierarchy is still alive as that won't be released until the pidfd has > > put the reference on struct @pid. > > Yes, so I still don't undestand, sorry :/ > > IOW. Why not check current_in_pidns() at the start? and do > task = get_pid_task() later, right before Sure, that's a good suggestion. > > if (!task) > goto copy_out; > > ? > > Oleg. >