Re: [PATCH bpf-next 0/3] Introduce pinnable bpf_link kernel abstraction

Toke Høiland-Jørgensen <toke@xxxxxxxxxx> · Wed, 04 Mar 2020 08:47:44 +0100

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> writes:

> On Tue, Mar 03, 2020 at 11:27:13PM +0100, Toke Høiland-Jørgensen wrote:
>> Alexei Starovoitov <ast@xxxxxx> writes:
>> >
>> > Legacy api for tc, xdp, cgroup will not be able to override FD-based
>> > link. For TC it's easy. cls-bpf allows multi-prog, so netlink
>> > adding/removing progs will not be able to touch progs that are
>> > attached via FD-based link.
>> > Same thing for cgroups. FD-based link will be similar to 'multi' mode.
>> > The owner of the link has a guarantee that their program will
>> > stay attached to cgroup.
>> > XDP is also easy. Since it has only one prog. Attaching FD-based link
>> > will prevent netlink from overriding it.
>> 
>> So what happens if the device goes away?
>
> I'm not sure yet whether it's cleaner to make netdev, qdisc, cgroup to be held
> by the link or use notifier approach. There are pros and cons to both.
>
>> > This way the rootlet prog installed by libxdp (let's find a better name
>> > for it) will stay attached.
>> 
>> Dispatcher prog?
>
> would be great, but 'bpf_dispatcher' name is already used in the kernel.
> I guess we can still call the library libdispatcher and dispatcher prog?
> Alternatives:
> libchainer and chainer prog
> libaggregator and aggregator prog?
> libpolicer kinda fits too, but could be misleading.

Of those, I like 'dispatcher' best.

> libxdp is very confusing. It's not xdp specific.

Presumably the parts that are generally useful will just end up in
libbpf (eventually)?

>> > libxdp can choose to pin it in some libxdp specific location, so other
>> > libxdp-enabled applications can find it in the same location, detach,
>> > replace, modify, but random app that wants to hack an xdp prog won't
>> > be able to mess with it.
>> 
>> What if that "random app" comes first, and keeps holding on to the link
>> fd? Then the admin essentially has to start killing processes until they
>> find the one that has the device locked, no?
>
> Of course not. We have to provide an api to make it easy to discover
> what process holds that link and where it's pinned.
> But if we go with notifier approach none of it is an issue.
> Whether target obj is held or notifier is used everything I said before still
> stands. "random app" that uses netlink after libdispatcher got its link FD will
> not be able to mess with carefully orchestrated setup done by
> libdispatcher.

Protecting things against random modification is fine. What I want to
avoid is XDP/tc programs locking the device so an admin needs to perform
extra steps if it is in use when (e.g.) shutting down a device. XDP
should be something any application can use as acceleration, and if it
becomes known as "that annoying thing that locks my netdev", then that
is not going to happen.

> Also either approach will guarantee that infamous message:
> "unregister_netdevice: waiting for %s to become free. Usage count"
> users will never see.
>
>> And what about the case where the link fd is pinned on a bpffs that is
>> no longer available? I.e., if a netdevice with an XDP program moves
>> namespaces and no longer has access to the original bpffs, that XDP
>> program would essentially become immutable?
>
> 'immutable' will not be possible.
> I'm not clear to me how bpffs is going to disappear. What do you mean
> exactly?

# stat /sys/fs/bpf | grep Device
Device: 1fh/31d	Inode: 1013963     Links: 2
# mkdir /sys/fs/bpf/test; ls /sys/fs/bpf
test
# ip netns add test
# ip netns exec test stat /sys/fs/bpf/test
stat: cannot stat '/sys/fs/bpf/test': No such file or directory
# ip netns exec test stat /sys/fs/bpf | grep Device
Device: 3fh/63d	Inode: 12242       Links: 2

It's a different bpffs instance inside the netns, so it won't have
access to anything pinned in the outer one...

-Toke