Re: [PATCH bpf-next 1/4] xdp: Support specifying expected existing program when attaching XDP

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Fri, 27 Mar 2020 16:00:47 -0700

On Fri, Mar 27, 2020 at 01:06:46PM +0100, Toke Høiland-Jørgensen wrote:
> Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> writes:
> 
> > On Thu, Mar 26, 2020 at 01:35:13PM +0100, Toke Høiland-Jørgensen wrote:
> >> 
> >> Additionally, in the case where there is *not* a central management
> >> daemon (i.e., what I'm implementing with libxdp), this would be the flow
> >> implemented by the library without bpf_link:
> >> 
> >> 1. Query kernel for current BPF prog loaded on $IFACE
> >> 2. Sanity-check that this program is a dispatcher program installed by
> >>    libxdp
> >> 3. Create a new dispatcher program with whatever changes we want to do
> >>    (such as adding another component program).
> >> 4. Atomically replace the old program with the new one using the netlink
> >>    API in this patch series.
> >
> > in this model what stops another application that is not using libdispatcher to
> > nuke dispatcher program ?
> 
> Nothing. But nothing is stopping it from issuing 'ip link down' either -
> an application with CAP_NET_ADMIN is implicitly trusted to be
> well-behaved. This patch series is just adding the kernel primitive that
> enables applications to be well-behaved. I consider it an API bug-fix.

I think what you're proposing is not a fix, but a band-aid.
And from what I can read in this thread you remain unconvinced that
you will hit exactly the same issues we're describing.
We hit them already and you will hit them a year from now.
Simply because fb's usage of all parts of bpf is about 3-4 years ahead
of everyone else.
I'm trying to convince you that your libxdp will be in much better
shape a year from now. It will be prepared for a situation when
other libxdp clones exist and are trying to do the same.
While you're saying:
"let me shoot myself in the foot. I know what I'm doing. I'll be fine".
I know you will not be. And soon enough you'll come back proposing
locking, id, owner apis for xdp.
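
For reference, the four-step flow quoted above amounts to a compare-and-swap
on the interface's single XDP slot. A rough sketch of that logic as a toy
Python model follows. It is not the netlink or libbpf API: Interface,
add_component and the integer program ids are made up for illustration, and
the retry loop is an assumption about what a library would do when the swap
fails.

```python
import itertools


class Interface:
    """Toy stand-in for a netdev with a single XDP attachment slot."""

    def __init__(self):
        self.prog_id = None  # id of the currently attached program, if any

    def attach(self, new_id, expected_id=None, check_expected=False):
        """Attach new_id; with check_expected, fail unless the slot still holds expected_id."""
        if check_expected and self.prog_id != expected_id:
            raise RuntimeError("attached program changed")
        self.prog_id = new_id


def add_component(iface, dispatchers, component, next_id):
    """Steps 1-4 of the quoted flow, retried if the swap loses a race."""
    while True:
        old_id = iface.prog_id  # 1. query the current program
        if old_id is not None and old_id not in dispatchers:
            raise ValueError("not a libxdp dispatcher")  # 2. sanity check
        # 3. build a new dispatcher containing the extra component
        components = dispatchers.get(old_id, []) + [component]
        new_id = next_id()
        try:
            # 4. replace only if old_id is still the attached program
            iface.attach(new_id, expected_id=old_id, check_expected=True)
        except RuntimeError:
            continue  # somebody else swapped first; start over
        dispatchers[new_id] = components
        return new_id


ids = itertools.count(1)
iface = Interface()
dispatchers = {}  # program id -> list of component names (hypothetical bookkeeping)
add_component(iface, dispatchers, "firewall1", lambda: next(ids))
add_component(iface, dispatchers, "firewall2", lambda: next(ids))
print(dispatchers[iface.prog_id])
```

Note that the check only helps callers that opt in: attach() without
check_expected still overwrites the slot unconditionally.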

> >> Whereas with bpf_link, it would be:
> >> 
> >> 1. Find the pinned bpf_link for $IFACE (e.g., load from
> >>    /sys/fs/bpf/iface-links/$IFNAME).
> >> 2. Query kernel for current BPF prog linked to $LINK
> >> 3. Sanity-check that this program is a dispatcher program installed by
> >>    libxdp
> >> 4. Create a new dispatcher program with whatever changes we want to do
> >>    (such as adding another component program).
> >> 5. Atomically replace the old program with the new one using the
> >>    LINK_UPDATE bpf() API.
> >
> > whereas here dispatcher program is only accessible to libdispatcher.
> > Instance of bpffs needs to be known to libdispatcher only.
> > That's the ownership I've been talking about.
> >
> > As discussed early we need a way for _human_ to nuke dispatcher program,
> > but such api shouldn't be usable out of application/task.
> 
> As long as there is this kind of override in place, I'm not actually
> fundamentally opposed to the concept of bpf_link for XDP, as an
> additional mechanism. What I'm opposed to is using bpf_link as a reason
> to block this series.
> 
> In fact, a way to implement the "human override" you mention, could be
> to reuse the mechanism implemented in this series: If the EXPECTED_FD
> passed via netlink is a bpf_link FD, that could be interpreted as an
> override by the kernel.

That's not "human override". You want to use expected_fd in libxdp.
That's not human. It means any 'yum install firewall' will be nuking
the bpf_link and the careful orchestration of our libxdp.

As far as blocking cap_net_admin...
you mentioned that use case is to do:
sudo yum install firewall1
sudo yum install firewall2

when these packages are being installed they will invoke startup scripts
that will install their dispatcher progs on eth0.
Imagine firewall2 is not using the correct version of libxdp, or is using
a buggy one. All the good work from firewall1 goes down the drain.
Note in both cases you only need cap_net_admin to install the prog.
The packages will not be reconfiguring eth0. They need to be told
which interface to apply firewall to. That's all.
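
To make the firewall1/firewall2 scenario concrete, here is a toy Python
model comparing the two approaches. It is only a sketch under stated
assumptions: the class and method names are hypothetical, and the rule that
a plain netlink attach is refused while a link holds the slot is my
assumption about how link ownership would be enforced, not something this
series implements.

```python
class Interface:
    """Toy netdev with one XDP slot, optionally held by a link owner."""

    def __init__(self):
        self.prog = None        # list of component names in the dispatcher
        self.link_owner = None  # set while a bpf_link-like object holds the slot

    def netlink_attach(self, prog, expected=None, use_expected=False):
        """CAP_NET_ADMIN-style attach; the expected check is opt-in."""
        if self.link_owner is not None:
            raise PermissionError("slot is held by a link")  # assumed behaviour
        if use_expected and self.prog != expected:
            raise RuntimeError("attached program changed")
        self.prog = prog

    def link_create(self, owner, prog):
        """Take ownership of an empty slot."""
        if self.link_owner is not None or self.prog is not None:
            raise PermissionError("slot is busy")
        self.link_owner = owner
        self.prog = prog

    def link_update(self, owner, prog):
        """Only the holder of the link may replace the program."""
        if owner is not self.link_owner:
            raise PermissionError("not the link owner")
        self.prog = prog


# Case A: expected-program check only. firewall1 is well-behaved,
# firewall2 ships a buggy library that skips the check.
eth0_a = Interface()
eth0_a.netlink_attach(["firewall1"], expected=None, use_expected=True)
eth0_a.netlink_attach(["firewall2"])  # overwrites firewall1's dispatcher

# Case B: the dispatcher is held through a link owned by the library.
eth0_b = Interface()
libxdp = object()  # stands in for whoever holds the pinned link
eth0_b.link_create(libxdp, ["firewall1"])
try:
    eth0_b.netlink_attach(["firewall2"])  # same buggy attach
    clobbered = True
except PermissionError:
    clobbered = False
eth0_b.link_update(libxdp, eth0_b.prog + ["firewall2"])
```

In case A firewall1 is gone from eth0 after the second install. In case B
the buggy attach fails and both components end up in the dispatcher.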