Mon, Nov 20, 2023 at 03:23:59PM CET, jhs@xxxxxxxxxxxx wrote:
>On Mon, Nov 20, 2023 at 4:39 AM Jiri Pirko <jiri@xxxxxxxxxxx> wrote:
>>
>> Fri, Nov 17, 2023 at 09:46:11PM CET, jhs@xxxxxxxxxxxx wrote:
>> >On Fri, Nov 17, 2023 at 1:37 PM John Fastabend <john.fastabend@xxxxxxxxx> wrote:
>> >>
>> >> Jamal Hadi Salim wrote:
>> >> > On Fri, Nov 17, 2023 at 1:27 AM John Fastabend <john.fastabend@xxxxxxxxx> wrote:
>> >> > >
>> >> > > Jamal Hadi Salim wrote:
>> >>
>> [...]
>> >>
>> >> I think I'm judging the technical work here. Bullet points.
>> >>
>> >> 1. The p4c-tc implementation looks like it should be slower, in
>> >> terms of pkts/sec, than a bpf implementation. Meaning I suspect a
>> >> pipeline and objects laid out like this will lose to a BPF program
>> >> with a parser and a single lookup. The p4c-ebpf compiler should
>> >> look to create optimized eBPF code, not some emulated switch
>> >> topology.
>> >>
>> >
>> >The parser is ebpf based. The other objects, which require control
>> >plane interaction, are not - those interact via netlink.
>> >We published perf data a while back - presented at the P4 workshop
>> >back in April (it was in the cover letter):
>> >https://github.com/p4tc-dev/docs/blob/main/p4-conference-2023/2023P4WorkshopP4TC.pdf
>> >But do note: the correct abstraction is the first priority.
>> >Optimization is something we can teach the compiler over time. But
>> >even with the minimalist code generation you can see that our
>> >approach always beats ebpf in LPM and ternary. The other ones I am
>> >pretty sure
>>
>> Any idea why? Perhaps the existing eBPF maps are not that suitable
>> for these kinds of lookups? I mean, in theory, eBPF should always be
>> faster.
>
>We didn't look closely; however, that is not the point - the point is
>that the perf difference, if there is one, is not big, with the big
>win being the proper P4 abstraction. For LPM, for sure, our
>algorithmic approach is better. For ternary, the compute-intensive
>looping is better done in C. And for exact match, I believe that ebpf
>uses better hashing.
>Again, that is not the point we were trying to validate in those
>experiments.
>
>On your point of "maps are not that suitable": P4 tables tend to have
>very specific attributes (examples: associated meters, counters,
>default hit and miss actions, etc).
>
>> >we can optimize over time.
>> >Your view of "single lookup" is true for simple programs, but if
>> >you have 10 tables trying to model a 5G function then it doesn't
>> >make sense (and I think the data we published was clear that you
>> >gain no advantage using ebpf - as a matter of fact there was no
>> >perf difference between XDP and tc in such cases).
>> >
>> >> 2. The p4c-tc control plane looks slower than a directly mmaped
>> >> bpf map: a simple update vs a netlink msg. The argument that BPF
>> >> can't do CRUD (which we had offlist) seems incorrect to me.
>> >> Correct me if I'm wrong with details about why.
>> >>
>> >
>> >So let me see...
>> >You want me to replace netlink and all its features and rewrite it
>> >using the ebpf system calls? Congestion control, event handling,
>> >arbitrary message crafting, etc., and the years of work that went
>> >into netlink? HELL, NO.
>>
>> Wait, I don't think John suggests anything like that. He just
>> suggests having the tables as eBPF maps.
>
>What's the difference? Unless maps can do netlink.
>
>> Honestly, I don't understand the fixation on netlink. Its socket
>> messaging, memcpies, processing overhead, etc. can't keep up with
>> mmaped memory access at scale. Measure that and I bet you'll get
>> drastically different results.
>>
>> I mean, netlink is good for a lot of things, but that does not mean
>> it is a universal answer to userspace<->kernel data passing.
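
To make my point above concrete: with an mmapable array map, a table
update is a plain store into shared memory - no per-entry syscall, no
message construction, no parsing. A minimal userspace sketch (the map
name, value layout, and sizes are made up for illustration; this is
not the actual P4TC or p4c-ebpf output, and error handling is
omitted):

  #include <stdint.h>
  #include <sys/mman.h>
  #include <bpf/bpf.h>

  struct entry { uint32_t action; uint32_t counter; };

  int main(void)
  {
          const int nentries = 1024;

          /* Create an array map that userspace may mmap directly. */
          LIBBPF_OPTS(bpf_map_create_opts, opts,
                      .map_flags = BPF_F_MMAPABLE);
          int fd = bpf_map_create(BPF_MAP_TYPE_ARRAY, "p4_table",
                                  sizeof(uint32_t), sizeof(struct entry),
                                  nentries, &opts);

          /* Map the table's value area into our address space. */
          struct entry *tbl = mmap(NULL, nentries * sizeof(*tbl),
                                   PROT_READ | PROT_WRITE, MAP_SHARED,
                                   fd, 0);

          /* "Update" of entry 42: one store, no syscall, no netlink
           * message to build or parse.
           */
          tbl[42].action = 7;
          return 0;
  }

Versus a netlink round trip per update: message alloc, attribute
packing, copy into the kernel, ack copied back out. At scale, those
are very different costs.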
>Here's a small sample of our requirements that are satisfied by
>netlink for the P4 object hierarchy[1]:
>1. Msg construction/parsing
>2. Multi-user request/response messaging

What is actually the use case for having multiple users program the p4
pipeline in parallel?

>3. Multi-user event subscribe/publish messaging

Same here. What is the use case for multiple users receiving p4 events?

>
>I don't think I need to provide an explanation of the differences here
>vis-a-vis what the ebpf system calls provide vs what netlink provides
>and how netlink is a clear fit. If it is not clear I can give more

It is not :/

>breakdown. And of course there's more, but the above is a good sample.
>
>The part that is taken for granted is the control plane code and
>interaction, which is an extremely important detail. The P4
>abstraction requires hierarchies with different compiler-generated
>encoded path ids etc. This ID mapping gets exacerbated by having
>multitudes of P4

Why doesn't the actual eBPF map serve the same purpose as the ID?
ID:map, 1:1?

>programs which have different requirements. Netlink is a natural fit
>for this P4 abstraction. Not to mention that the netlink/tc path (and
>in particular the ID mapping) provides a conduit for offload when
>that is needed.
>eBPF is just a tool - and the objects are intended to be generic - and
>I don't see how any of this could be achieved without retooling to
>make it more specific to P4.
>
>cheers,
>jamal
>
>
>
>> >I should note that there was an interesting talk at netdevconf 0x17
>> >where the speaker showed the challenges of dealing with ebpf on
>> >"day two" - the slides/videos are not up yet, but the link is:
>> >https://netdevconf.info/0x17/sessions/talk/is-scaling-ebpf-easy-yet-a-small-step-to-one-server-but-giant-leap-to-distributed-network.html
>> >The point the speaker was making is that it's always easy to whip up
>> >an ebpf program that can slice and dice packets and maybe even flash
>> >LEDs, but the real work and challenge is in the control plane. I
>> >agree with the speaker based on my experiences. This discussion of
>> >replacing netlink with ebpf system calls is absolutely a
>> >non-starter. Let's just end the discussion and agree to disagree if
>> >you are going to keep insisting on that.
>>
>>
>> [...]
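
Btw, on your point 3 above: for a single event consumer (which is the
only case I have seen in practice, hence my question), a BPF ring
buffer already gives you kernel->user event publishing over mmaped
memory. A rough sketch of the consumer side, assuming libbpf and a
hypothetical "p4_events" BPF_MAP_TYPE_RINGBUF map pinned by the
datapath program (pin path and event layout made up; error handling
omitted):

  #include <stdio.h>
  #include <bpf/bpf.h>
  #include <bpf/libbpf.h>

  /* Called by libbpf for each event the kernel program submitted. */
  static int handle_event(void *ctx, void *data, size_t len)
  {
          printf("p4 event: %zu bytes\n", len);
          return 0;
  }

  int main(void)
  {
          int map_fd = bpf_obj_get("/sys/fs/bpf/p4_events");
          struct ring_buffer *rb =
                  ring_buffer__new(map_fd, handle_event, NULL, NULL);

          for (;;)
                  ring_buffer__poll(rb, -1 /* block indefinitely */);
  }

To be clear, this does not give you the multi-user fan-out that
netlink multicast groups do - which is exactly why I'm asking what the
multi-user use case is.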