Re: [RFC PATCH bpf-next 00/14] xdp_flow: Flow offload to XDP

Toshiaki Makita <toshiaki.makita1@xxxxxxxxx> · Sat, 17 Aug 2019 23:10:10 +0900

On 19/08/17 (Sat) 0:35:50, Stanislav Fomichev wrote:
On 08/16, Toshiaki Makita wrote:
On 2019/08/16 0:21, Stanislav Fomichev wrote:
On 08/15, Toshiaki Makita wrote:
On 2019/08/15 2:07, Stanislav Fomichev wrote:
On 08/13, Toshiaki Makita wrote:
* Implementation

xdp_flow makes use of UMH to load an eBPF program for XDP, similar to
bpfilter. The difference is that xdp_flow does not generate the eBPF
program dynamically; instead, a prebuilt program is embedded in UMH. This
is mainly because flow insertion is quite frequent. If we generated
and loaded an eBPF program on each insertion of a flow, the latency of the
first packet of ping in the above test would increase, which I want to avoid.
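
(Illustration only, not code from the patch set.) The latency argument above can be modelled in userspace: the program is loaded once, and inserting a flow is only a table update. In this minimal sketch a dict stands in for a BPF map; the names FlowKey, flow_table, insert_flow and handle_packet, and the "pass_to_stack" fallback on a table miss, are all assumptions made for the example.

```python
from typing import Dict, NamedTuple


class FlowKey(NamedTuple):
    """Hypothetical flow key; real flower keys carry many more fields."""
    src_ip: str
    dst_ip: str
    proto: int


# A dict stands in for the BPF map that holds the flow table.
flow_table: Dict[FlowKey, str] = {}

# Counts simulated program loads, to show that inserts do not reload.
stats = {"program_loads": 0}


def load_prebuilt_program() -> None:
    """Model loading the prebuilt program once, at setup time."""
    stats["program_loads"] += 1


def insert_flow(key: FlowKey, action: str) -> None:
    """Flow insertion only updates the table; no program is generated."""
    flow_table[key] = action


def handle_packet(key: FlowKey) -> str:
    """Model the per-packet lookup; a miss falls back to the normal stack
    (an assumption for this sketch)."""
    return flow_table.get(key, "pass_to_stack")
```

With this shape, the cost of adding a flow is one table write, independent of how many flows were added before.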
Can this instead be implemented with a new hook that will be called
for TC events? This hook can write to a perf event buffer, and the control
plane will insert/remove/modify flow tables in the BPF maps (the control
plane will also install the xdp program).

Why do we need UMH? What am I missing?

So you suggest doing everything in xdp_flow kmod?
You probably don't even need xdp_flow kmod. Add a new tc "offload" mode
(bypass) that dumps every command via netlink (or calls the BPF hook
where you can dump it into a perf event buffer) and then read that info
from userspace and install xdp programs and modify flow tables.
I don't think you need any kernel changes besides that stream
of data from the kernel about qdisc/tc flow creation/removal/etc.
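
(Illustration only, not an existing tool or API.) The proposed daemon can be sketched as a loop that consumes a stream of TC events and mirrors them into a flow table. The event layout ("op", "match", "action") and the function names are hypothetical; a real daemon would parse netlink messages or perf buffer records instead of dicts.

```python
from typing import Dict, Iterable, List, Tuple

# A dict stands in for the flow table kept in BPF maps.
FlowTable = Dict[Tuple[str, ...], str]


def apply_tc_event(table: FlowTable, event: dict) -> bool:
    """Apply one hypothetical TC event; return False if it is not understood."""
    op = event.get("op")
    key = tuple(event.get("match", ()))
    if op in ("add", "replace"):
        table[key] = event["action"]
        return True
    if op == "del":
        table.pop(key, None)
        return True
    # Unknown operation: the table no longer reflects what TC would do.
    return False


def run_daemon(events: Iterable[dict], table: FlowTable) -> List[dict]:
    """Consume the event stream and collect events the daemon could not apply."""
    unhandled = []
    for event in events:
        if not apply_tc_event(table, event):
            unhandled.append(event)
    return unhandled
```

Any event that ends up in the unhandled list is a point where such a mirror would diverge from actual TC behavior.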

My intention is to let more people who want high-speed networking use XDP
easily, hence the transparent XDP offload through the current TC interface.

What userspace program would monitor TC events with your suggestion?
Have a new system daemon (xdpflowerd) that is independently
packaged/shipped/installed. Anybody who wants accelerated TC can
download/install it. OVS can be completely unaware of this.

Thanks, but that's what I called an unreliable solution...

ovs-vswitchd? If so, it does not even need to monitor TC. It can
implement XDP offload directly.
(However, I prefer a kernel solution. Please refer to the "About alternative
userland (ovs-vswitchd etc.) implementation" section in the cover letter.)

Also, such a TC monitoring solution can easily get out of sync with real TC
behavior, as TC filter/flower is being heavily developed and changed,
e.g. introduction of TC block, support for multiple masks with the same pref, etc.
I'm not sure such an unreliable solution has much value.
This same issue applies to the in-kernel implementation, doesn't it?
What happens if somebody sends patches for a new flower feature but
doesn't add appropriate xdp support? Do we reject them?

Why would we accept a patch which breaks another in-kernel subsystem?
Such patches can be applied accidentally, but we are supposed to fix such
problems in the -rc phase, aren't we?

Toshiaki Makita