Re: [PATCH net-next 01/16] libeth: convert to netmem

Mina Almasry <almasrymina@xxxxxxxxxx> · Tue, 11 Mar 2025 10:43:35 -0700

On Tue, Mar 11, 2025 at 10:23 AM Alexander Lobakin
<aleksander.lobakin@xxxxxxxxx> wrote:
>
> From: Mina Almasry <almasrymina@xxxxxxxxxx>
> Date: Wed, 5 Mar 2025 16:13:32 -0800
>
> > On Wed, Mar 5, 2025 at 8:23 AM Alexander Lobakin
> > <aleksander.lobakin@xxxxxxxxx> wrote:
> >>
> >> Back when the libeth Rx core was initially written, devmem was a draft
> >> and netmem_ref didn't exist in the mainline. Now that it's here, make
> >> libeth MP-agnostic before introducing any new code or any new library
> >> users.
>
> [...]
>
> >>         /* Very rare, but possible case. The most common reason:
> >>          * the last fragment contained FCS only, which was then
> >>          * stripped by the HW.
> >>          */
> >>         if (unlikely(!len)) {
> >> -               libeth_rx_recycle_slow(page);
> >> +               libeth_rx_recycle_slow(netmem);
> >
> > I think before this patch this would have expanded to:
> >
> > page_pool_put_full_page(pool, page, true);
> >
> > But now I think it expands to:
> >
> > page_pool_put_full_netmem(netmem_get_pp(netmem), netmem, false);
> >
> > Is the switch from true to false intentional? Is this a slow path so
> > it doesn't matter?
>
> Intentional. unlikely() means it's slowpath already. libeth_rx_recycle()
> is inline, while _slow() is not. I don't want slowpath to be inlined.
> While I was originally writing the code changed here, I didn't pay much
> attention to that, but since then I altered my approach and now try to
> put anything slow out of line to not waste object code.
>
> Also, some time ago I changed PP's approach to decide whether it can
> recycle buffers directly or not. Previously, if that `allow_direct` was
> false, it was always falling back to ptr_ring, but now if `allow_direct`
> is false, it still checks whether it can be recycled directly.
>

Thanks, yes I forgot about that.

> [...]
>
> >> @@ -3122,16 +3122,20 @@ static u32 idpf_rx_hsplit_wa(const struct libeth_fqe *hdr,
> >>                              struct libeth_fqe *buf, u32 data_len)
> >>  {
> >>         u32 copy = data_len <= L1_CACHE_BYTES ? data_len : ETH_HLEN;
> >> +       struct page *hdr_page, *buf_page;
> >>         const void *src;
> >>         void *dst;
> >>
> >> -       if (!libeth_rx_sync_for_cpu(buf, copy))
> >> +       if (unlikely(netmem_is_net_iov(buf->netmem)) ||
> >> +           !libeth_rx_sync_for_cpu(buf, copy))
> >>                 return 0;
> >>
> >
> > I could not immediately understand why you need a netmem_is_net_iov
> > check here. libeth_rx_sync_for_cpu will delegate to
> > page_pool_dma_sync_netmem_for_cpu which should do the right thing
> > regardless of whether the netmem is a page or net_iov, right? Is this
> > to save some cycles?
>
> If the payload buffer is net_iov, the kernel doesn't have access to it.
> This means, this W/A can't be performed (see memcpy() below the check).
> That's why I exit early explicitly.
> libeth_rx_sync_for_cpu() returns false only if the size is zero.
>
> netmem_is_net_iov() is under unlikely() here, because when using devmem,
> you explicitly configure flow steering, so that only TCP/UDP/whatever
> frames will land on this queue. Such frames are split correctly by
> idpf's HW.
> I need this WA because let's say unfortunately this HW places the whole
> frame to the payload buffer when it's not TCP/UDP/... (see the comment
> above this function).
> For example, it even does so for ICMP, although HW is fully aware of the
> ICMP format. If I was a HW designer of this NIC, I'd instead try putting
> the whole frame to the header buffer, not the payload one. And in
> general, do header split for all known packet types, not just TCP/UDP/..
> But meh... A different story.
>

Makes sense. FWIW:

Reviewed-by: Mina Almasry <almasrymina@xxxxxxxxxx>

-- 
Thanks,
Mina