On Tue, Mar 11, 2025 at 12:33:48AM -0700, Kuniyuki Iwashima wrote:
> From: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@xxxxxxxxxxxxx>
> Date: Sun, 9 Mar 2025 14:28:11 +0100
> > 1. Add socket cgroup id and socket's peer cgroup id in socket's fdinfo
>
> Why do you want to add yet another racy interface?
>
> > 2. Add SO_PEERCGROUPID which allows to retrieve socket's peer cgroup id
> > 3. Add SO_PEERCGROUPID kselftest
> >
> > Generally speaking, this API allows race-free resolution of socket's peer cgroup id.
> > Currently, to do that, the SCM_CREDENTIALS/SCM_PIDFD -> pid -> /proc/<pid>/cgroup sequence
> > is used, which is racy.
>
> A few more words about the race (pid recycling?) would be appreciated.
>
> I somewhat assumed a pid is not recycled until all of its pidfds are
> close()d, but it sounds like that is not the case?

No, that would allow starving the kernel of pid numbers. pidfds don't
pin struct task_struct, for a multitude of reasons, similar to how
cred->peer or scm->pid don't stash a task_struct but a struct pid.

> > As we don't add any new state to the socket itself, there are no potential locking issues
> > or performance problems. We use the already existing sk->sk_cgrp_data.
> >
> > We already have analogous interfaces to retrieve this
> > information:
> > - inet_diag: INET_DIAG_CGROUP_ID
> > - eBPF: bpf_sk_cgroup_id
> >
> > Having a getsockopt() interface makes sense for many applications, because using eBPF is
> > not always an option, while inet_diag has obvious complexity and performance drawbacks
> > if we only want to get this specific info for one specific socket.
>
> If it's limited to the connect()ed peer, I'd add UNIX_DIAG_CGROUP_ID
> and UNIX_DIAG_PEER_CGROUP_ID instead. Then ss can also use that easily.