Re: [PATCH v2 bpf-next] bpf: devmap: move drop error path to devmap for XDP_REDIRECT

Jesper Dangaard Brouer <brouer@xxxxxxxxxx> writes:

On Sun, 28 Feb 2021 23:27:25 +0100
Lorenzo Bianconi <lorenzo.bianconi@xxxxxxxxxx> wrote:

> >  	drops = bq->count - sent;
> > -out:
> > -	bq->count = 0;
> > +	if (unlikely(drops > 0)) {
> > +		/* If not all frames have been transmitted, it is our
> > +		 * responsibility to free them
> > +		 */
> > +		for (i = sent; i < bq->count; i++)
> > +			xdp_return_frame_rx_napi(bq->q[i]);
> > +	}
> >
> > Wouldn't the logic above be the same even w/o the 'if' condition ?
>
> it is just an optimization to avoid the for loop instruction if sent = bq->count

True, and I like this optimization.
It will affect how the code layout is (and thereby I-cache usage).

I'm not sure what I-cache optimization you mean here. Compiling the following C code:

# define unlikely(x)	__builtin_expect(!!(x), 0)

extern void xdp_return_frame_rx_napi(int q);

struct bq_stuff {
   int q[4];
   int count;
};

int test(int sent, struct bq_stuff *bq) {
   int i;
   int drops;

   drops = bq->count - sent;
   if(unlikely(drops > 0))
       for (i = sent; i < bq->count; i++)
           xdp_return_frame_rx_napi(bq->q[i]);

   return 2;
}

with x86_64 gcc 10.2 and the -O3 flag on https://godbolt.org/ (which shows the assembly generated by different compilers) yields the following assembly:

test:
       mov     eax, DWORD PTR [rsi+16]
       mov     edx, eax
       sub     edx, edi
       test    edx, edx
       jg      .L10
.L6:
       mov     eax, 2
       ret
.L10:
       cmp     eax, edi
       jle     .L6
       push    rbp
       mov     rbp, rsi
       push    rbx
       movsx   rbx, edi
       sub     rsp, 8
.L3:
       mov     edi, DWORD PTR [rbp+0+rbx*4]
       add     rbx, 1
       call    xdp_return_frame_rx_napi
       cmp     DWORD PTR [rbp+16], ebx
       jg      .L3
       add     rsp, 8
       mov     eax, 2
       pop     rbx
       pop     rbp
       ret


When dropping the 'if' completely, I get the following assembly output:

test:
       cmp     edi, DWORD PTR [rsi+16]
       jge     .L6
       push    rbp
       mov     rbp, rsi
       push    rbx
       movsx   rbx, edi
       sub     rsp, 8
.L3:
       mov     edi, DWORD PTR [rbp+0+rbx*4]
       add     rbx, 1
       call    xdp_return_frame_rx_napi
       cmp     DWORD PTR [rbp+16], ebx
       jg      .L3
       add     rsp, 8
       mov     eax, 2
       pop     rbx
       pop     rbp
       ret
.L6:
       mov     eax, 2
       ret

which exits earlier from the function if 'drops > 0' compared to the original code (the 'for' loop looks a little different, but this shouldn't affect icache).

When removing the 'if' and wrapping the 'for' condition in 'unlikely' instead:

for (i = sent; unlikely(i < bq->count); i++)

I get the following assembly code:

test:
       cmp     edi, DWORD PTR [rsi+16]
       jl      .L10
       mov     eax, 2
       ret
.L10:
       push    rbx
       movsx   rbx, edi
       sub     rsp, 16
.L3:
       mov     edi, DWORD PTR [rsi+rbx*4]
       mov     QWORD PTR [rsp+8], rsi
       add     rbx, 1
       call    xdp_return_frame_rx_napi
       mov     rsi, QWORD PTR [rsp+8]
       cmp     DWORD PTR [rsi+16], ebx
       jg      .L3
       add     rsp, 16
       mov     eax, 2
       pop     rbx
       ret

which is shorter than the other two (by one line compared to the second and by 7 lines compared to the original code) and seems as optimized as the second.


I'm far from being an assembly expert, and I tested a code snippet I wrote myself rather than the kernel's code (for the sake of simplicity only). Can you please elaborate on what makes the original 'if' essential? (I took the time to do the assembly tests, so please take the time on your side to prove your point. I'm not trying to be grumpy here.)

Shay


