On Wed, Mar 25, 2020 at 11:42:57AM +0100, Toke Høiland-Jørgensen wrote:
> Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> writes:
>
> > On Tue, Mar 24, 2020 at 12:22:47PM -0700, John Fastabend wrote:
> >> >
> >> > Well, I wasn't talking about any of those subsystems, I was talking
> >> > about networking :)
> >>
> >> My experience has been that networking in the strict sense of XDP no
> >> longer exists on its own without cgroups, flow dissector, sockops,
> >> sockmap, tracing, etc. All of these pieces are built, patched, loaded,
> >> pinned and otherwise managed and manipulated as BPF objects via libbpf.
> >>
> >> Because I have all this infra in place for other items its a bit odd
> >> imo to drop out of BPF apis to then swap a program differently in the
> >> XDP case from how I would swap a program in any other place. I'm
> >> assuming ability to swap links will be enabled at some point.
> >>
> >> Granted it just means I have some extra functions on the side to manage
> >> the swap similar to how 'qdisc' would be handled today but still not as
> >> nice an experience in my case as if it was handled natively.
> >>
> >> Anyways the netlink API is going to have to call into the BPF infra
> >> on the kernel side for verification, etc so its already not pure
> >> networking.
> >>
> >> >
> >> > In particular, networking already has a consistent and fairly
> >> > well-designed configuration mechanism (i.e., netlink) that we are
> >> > generally trying to move more functionality *towards* not *away from*
> >> > (see, e.g., converting ethtool to use netlink).
> >>
> >> True. But BPF programs are going to exist and interop with other
> >> programs not exactly in the networking space. Actually library calls
> >> might be used in tracing, cgroups, and XDP side. It gets a bit more
> >> interesting if the "same" object file (with some patching) runs in both
> >> XDP and sockops land for example.
> >
> > Thanks John for summarizing it very well.
> > It looks to me that netlink proponents fail to realize that "bpf for
> > networking" goes way beyond what netlink is doing and capable of doing in the
> > future. BPF_*_INET_* progs do core networking without any smell of netlink
> > anywhere. "But, but, but, netlink is the way to configure networking"... is
> > simply not true.
>
> That was not what I was saying. Obviously there are other components to
> the networking stack than netlink.
>
> What I'm saying is that netlink is the interface the kernel uses to
> *configure network devices*. And that attaching an XDP program is a
> network device configuration operation. I mean, it:
>
> - Relies on the RTNL lock for synchronisation
> - Fundamentally alters the flow of network packets on the device
> - Potentially has side effects like link up/down, HWQ reconfig etc

sure. Attaching a prog to ingress qdisc can be considered a 'configuration'
of qdisc because rtnl is needed and what not.
That doesn't contradict my point that other apis (not only netlink) take
rtnl lock, etc.

> I'm wondering if there's a way to reconcile these views? Maybe making
> the bpf_link attachment work by passing the link fd to the netlink API?

what kind of frankenstein that would be?

> That would keep the network interface configuration over netlink, but
> would still allow a BPF application to swap out "its" programs via the
> bpf_link APIs?

It's not about swapping. bpf_link brings ownership concept in the first place.
It could be done via bpf syscall, new syscall, netlink, ioctl, you name it.
It's all secondary. The key concept is ownership.