On 9/11/20 1:58 AM, Jesper Dangaard Brouer wrote: > On Thu, 10 Sep 2020 12:35:33 -0600 > David Ahern <dsahern@xxxxxxxxx> wrote: > >> On 9/10/20 11:50 AM, Jesper Dangaard Brouer wrote: >>> Maybe we should change the devmap-prog approach, and run this on the >>> xdp_frame's (in bq_xmit_all() to be precise) . Hangbin's patchset >>> clearly shows that we need this "layer" between running the xdp_prog and >>> the devmap-prog. >> >> I would prefer to leave it in dev_map_enqueue. >> >> The main premise at the moment is that the program attached to the >> DEVMAP entry is an ACL specific to that dev. If the program is going to >> drop the packet, then no sense queueing it. >> >> I also expect a follow on feature will be useful to allow the DEVMAP >> program to do another REDIRECT (e.g., potentially after modifying). It >> is not handled at the moment as it needs thought - e.g., limiting the >> number of iterative redirects. If such a feature does happen, then no >> sense queueing it to the current device. > > It makes a lot of sense to do queuing before redirecting again. The > (hidden) bulking we do at XDP redirect is the primary reason for the > performance boost. We all remember performance difference between > non-map version of redirect (which Toke fixed via always having the > bulking available in net_device->xdp_bulkq). > > In a simple micro-benchmark I bet it will look better running the > devmap-prog right after the xdp_prog (which is what we have today). But > I claim this is the wrong approach, as soon as (1) traffic is more > intermixed, and (2) devmap-prog gets bigger and becomes more specific > to the egress-device (e.g. BPF update constants per egress-device). > When this happens performance suffers, as I-cache and data-access to > each egress-device gets pushed out of cache. (Hint VPP/fd.io approach) > > Queuing xdp_frames up for your devmap-prog makes sense, as these share > common properties. With intermix traffic the first xdp_prog will sort > packets into egress-devices, and then the devmap-prog can operate on > these. The best illustration[1] of this sorting I saw in a Netflix > blogpost[2] about FreeBSD, section "RSS Assisted LRO" (not directly > related, but illustration was good). > > > [1] https://miro.medium.com/max/700/1%2alTGL1_D6hTMEMa7EDV8yZA.png > [2] https://netflixtechblog.com/serving-100-gbps-from-an-open-connect-appliance-cdb51dda3b99 > I understand the theory and testing will need to bear that out. There is a bit of distance (code wise) between where the program is run now and where you want to put it - the conversion from xdp_buff to xdp_frame, the enqueue, and what it means to do a redirect to another device in bq_xmit_all. More importantly though for a redirect is the current xdp_ok_fwd_dev check in __xdp_enqueue which for a redirect could be doing the wrong checks for the wrong device.