Re: [PATCH net-next v4 2/4] net: socket: add sockopts blacklist for BPF cgroup hook

Stanislav Fomichev <sdf@xxxxxxxxxx> · Mon, 17 Apr 2023 11:10:17 -0700

On Mon, Apr 17, 2023 at 7:42 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
>
> On Fri, Apr 14, 2023 at 06:55:39PM -0700, Stanislav Fomichev wrote:
> > On 04/13, Stanislav Fomichev wrote:
> > > On Thu, Apr 13, 2023 at 7:38 AM Aleksandr Mikhalitsyn
> > > <aleksandr.mikhalitsyn@xxxxxxxxxxxxx> wrote:
> > > >
> > > > On Thu, Apr 13, 2023 at 4:22 PM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Thu, Apr 13, 2023 at 3:35 PM Alexander Mikhalitsyn
> > > > > <aleksandr.mikhalitsyn@xxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > During work on SO_PEERPIDFD, it was discovered (thanks to Christian),
> > > > > > that bpf cgroup hook can cause FD leaks when used with sockopts which
> > > > > > install FDs into the process fdtable.
> > > > > >
> > > > > > After some offlist discussion it was proposed to add a blacklist of
> > > > >
> > > > > We try to replace this word by either denylist or blocklist, even in changelogs.
> > > >
> > > > Hi Eric,
> > > >
> > > > Oh, I'm sorry about that. :( Sure.
> > > >
> > > > >
> > > > > > socket options those can cause troubles when BPF cgroup hook is enabled.
> > > > > >
> > > > >
> > > > > Can we find the appropriate Fixes: tag to help stable teams ?
> > > >
> > > > Sure, I will add next time.
> > > >
> > > > Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
> > > >
> > > > I think it's better to add Stanislav Fomichev to CC.
> > >
> > > Can we use 'struct proto' bpf_bypass_getsockopt instead? We already
> > > use it for tcp zerocopy, I'm assuming it should work in this case as
> > > well?
> >
> > Jakub reminded me of the other things I wanted to ask here bug forgot:
> >
> > - setsockopt is probably not needed, right? setsockopt hook triggers
> >   before the kernel and shouldn't leak anything
> > - for getsockopt, instead of bypassing bpf completely, should we instead
> >   ignore the error from the bpf program? that would still preserve
>
> That's fine by me as well.
>
> It'd be great if the net folks could tell Alex how they would want this
> handled.

Doing the bypass seems fine with me for now. If we ever decide that
fd-based optvals are worth inspecting in bpf, we can lift that bypass.

> >   the observability aspect
>
> Please see for more details
> https://lore.kernel.org/lkml/20230411-nudelsalat-spreu-3038458f25c4@brauner

Thanks for the context. Yeah, sockopts are being used for a lot of
interesting things :-(

> > - or maybe we can even have a per-proto bpf_getsockopt_cleanup call that
> >   gets called whenever bpf returns an error to make sure protocols have
> >   a chance to handle that condition (and free the fd)
>
> Installing an fd into an fdtable makes it visible to userspace at which
> point calling close_fd() is doable but an absolute last resort and
> generally a good indicator of misdesign. If the bpf hook wants to make
> decisions based on the file then it should receive a struct
> file, not an fd.

SG! Then let's not over-complicate it for now and do a simple bypass.