Fri, Nov 17, 2023 at 09:46:11PM CET, jhs@xxxxxxxxxxxx wrote:
>On Fri, Nov 17, 2023 at 1:37 PM John Fastabend <john.fastabend@xxxxxxxxx> wrote:
>>
>> Jamal Hadi Salim wrote:
>> > On Fri, Nov 17, 2023 at 1:27 AM John Fastabend <john.fastabend@xxxxxxxxx> wrote:
>> > >
>> > > Jamal Hadi Salim wrote:

[...]

>> I think I'm judging the technical work here. Bullet points.
>>
>> 1. p4c-tc implementation looks like it should be slower
>>    in terms of pkts/sec than a bpf implementation. Meaning
>>    I suspect pipeline and objects laid out like this will lose
>>    to a BPF program with a parser and single lookup. The p4c-ebpf
>>    compiler should look to create optimized EBPF code not some
>>    emulated switch topology.
>>
>
>The parser is ebpf based. The other objects which require control
>plane interaction are not - those interact via netlink.
>We published perf data a while back - presented at the P4 workshop
>back in April (was in the cover letter)
>https://github.com/p4tc-dev/docs/blob/main/p4-conference-2023/2023P4WorkshopP4TC.pdf
>But do note: the correct abstraction is the first priority.
>Optimization is something we can teach the compiler over time. But
>even with the minimalist code generation you can see that our approach
>always beats ebpf in LPM and ternary. The other ones I am pretty sure

Any idea why? Perhaps the existing eBPF maps are not that suitable for
these kinds of lookups? I mean, in theory, eBPF should always be
faster.

>we can optimize over time.
>Your view of "single lookup" is true for simple programs but if you
>have 10 tables trying to model a 5G function then it doesn't make sense
>(and i think the data we published was clear that you gain no
>advantage using ebpf - as a matter of fact there was no perf
>difference between XDP and tc in such cases).
>
>> 2. p4c-tc control plane looks slower than a directly mmaped bpf
>>    map. Doing a simple update vs a netlink msg.
>>    The argument
>>    that BPF can't do CRUD (which we had offlist) seems incorrect
>>    to me. Correct me if I'm wrong with details about why.
>>
>
>So let me see....
>you want me to replace netlink and all its features and rewrite it
>using the ebpf system calls? Congestion control, event handling,
>arbitrary message crafting, etc and the years of work that went into
>netlink? NO to the HELL.

Wait, I don't think John suggests anything like that. He just suggests
having the tables as eBPF maps. Honestly, I don't understand the
fixation on netlink. Its socket messaging, memcpies, processing
overhead, etc can't keep up with mmaped memory access at scale.
Measure that and I bet you'll get drastically different results.

I mean, netlink is good for a lot of things, but that does not mean it
is a universal answer to userspace<->kernel data passing.

>I should note: that there was an interesting talk at netdevconf 0x17
>where the speaker showed the challenges of dealing with ebpf on "day
>two" - slides or videos are not up yet, but link is:
>https://netdevconf.info/0x17/sessions/talk/is-scaling-ebpf-easy-yet-a-small-step-to-one-server-but-giant-leap-to-distributed-network.html
>The point the speaker was making is it's always easy to whip up an ebpf
>program that can slice and dice packets and maybe even flash LEDs but
>the real work and challenge is in the control plane. I agree with the
>speaker based on my experiences. This discussion of replacing netlink
>with ebpf system calls is absolutely a non-starter. Let's just end the
>discussion and agree to disagree if you are going to keep insisting on
>that.

[...]