Re: [PATCH bpf-next] bpf: abstract loop unrolling pragmas in BPF selftests

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On 2/7/24 2:12 AM, Jose E. Marchesi wrote:
>> Some BPF tests use loop unrolling compiler pragmas that are clang
>> specific and not supported by GCC.  These pragmas, along with their
>> GCC equivalences are:
>>
>>    #pragma clang loop unroll_count(N)
>>    #pragma GCC unroll N
>>
>>    #pragma clang loop unroll(full)
>>    #pragma GCC unroll 65534
>>
>>    #pragma clang loop unroll(disable)
>>    #pragma GCC unroll 1
>>
>>    #pragma unroll [aka #pragma clang loop unroll(enable)]
>>    There is no GCC equivalence, and it seems to me that this clang
>>    pragma may be only useful when building without -funroll-loops to
>>    enable the optimization in particular loops.  In GCC -funroll-loops
>>    is enabled with -O2 and higher.  If this is also true in clang,
>>    perhaps these pragmas in selftests are redundant?
>
> You are right, at -O2 level, loop unrolling is enabled by default.
> So I think '#pragma unroll' can be removed since gcc also has
> loop unrolling enabled by default at -O2.
>
> Your patch has a conflict with latest bpf-next. Please rebase it
> on top of bpf-next, remove '#pragma unroll' support and resubmit.
> Thanks!

Note profiler.inc.h contains code like:

  #ifdef UNROLL
  	__pragma_loop_unroll
  #endif
 	for (int i = 0; i < ARRAY_SIZE(arr_struct->array); i++) {

And then it is inluded by several test programs, which define (or not)
UNROLL:

profiler1.c:

  #define UNROLL
  #include "profiler.inc.h"

profiler2.c:

  /* undef #define UNROLL */
  #include "profiler.inc.h"

In contrast, in pyperf.h or strobemeta.h we find code like:

  #ifdef NO_UNROLL
  	__pragma_loop_no_unroll
  #endif /* NO_UNROLL */
  	for (int i = 0; i < STROBE_MAX_STRS; ++i) {

And then programs including it define NO_UNROLL to disable unrolling.

If -funroll-oops is enabled with -O2 and BPF programs are always built
with -O2, then not defining UNROLL for profiler.inc.h, seems like
basically a no-op to me, because unrolling will still happen. This is
assuming that #pragma unroll in clang doesn't activates more aggressive
inlining.

>>
>> This patch adds a new header progs/bpf_compiler.h that defines the
>> following macros, which correspond to each pair of compiler-specific
>> pragmas above:
>>
>>    __pragma_loop_unroll_count(N)
>>    __pragma_loop_unroll_full
>>    __pragma_loop_no_unroll
>>    __pragma_loop_unroll
>>
>> The selftests using loop unrolling pragmas are then changed to include
>> the header and use these macros in place of the explicit pragmas.
>>
>> Tested in bpf-next master.
>> No regressions.
>>
>> Signed-off-by: Jose E. Marchesi <jose.marchesi@xxxxxxxxxx>
>> Cc: Yonghong Song <yhs@xxxxxxxx>
>> Cc: Eduard Zingerman <eddyz87@xxxxxxxxx>
>> Cc: Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx>
>> Cc: david.faust@xxxxxxxxxx
>> Cc: cupertino.miranda@xxxxxxxxxx
>> ---
>>   .../selftests/bpf/progs/bpf_compiler.h        | 33 +++++++++++++++++++
>>   tools/testing/selftests/bpf/progs/iters.c     |  5 +--
>>   tools/testing/selftests/bpf/progs/loop4.c     |  4 ++-
>>   .../selftests/bpf/progs/profiler.inc.h        | 17 +++++-----
>>   tools/testing/selftests/bpf/progs/pyperf.h    |  7 ++--
>>   .../testing/selftests/bpf/progs/strobemeta.h  | 18 +++++-----
>>   .../selftests/bpf/progs/test_cls_redirect.c   |  5 +--
>>   .../selftests/bpf/progs/test_lwt_seg6local.c  |  6 ++--
>>   .../selftests/bpf/progs/test_seg6_loop.c      |  4 ++-
>>   .../selftests/bpf/progs/test_skb_ctx.c        |  4 ++-
>>   .../selftests/bpf/progs/test_sysctl_loop1.c   |  6 ++--
>>   .../selftests/bpf/progs/test_sysctl_loop2.c   |  6 ++--
>>   .../selftests/bpf/progs/test_sysctl_prog.c    |  6 ++--
>>   .../selftests/bpf/progs/test_tc_tunnel.c      |  4 ++-
>>   tools/testing/selftests/bpf/progs/test_xdp.c  |  3 +-
>>   .../selftests/bpf/progs/test_xdp_loop.c       |  3 +-
>>   .../selftests/bpf/progs/test_xdp_noinline.c   |  5 +--
>>   .../selftests/bpf/progs/xdp_synproxy_kern.c   |  6 ++--
>>   .../testing/selftests/bpf/progs/xdping_kern.c |  3 +-
>>   19 files changed, 103 insertions(+), 42 deletions(-)
>>   create mode 100644 tools/testing/selftests/bpf/progs/bpf_compiler.h
>>
>> diff --git a/tools/testing/selftests/bpf/progs/bpf_compiler.h b/tools/testing/selftests/bpf/progs/bpf_compiler.h
>> new file mode 100644
>> index 000000000000..a7c343dc82e6
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/progs/bpf_compiler.h
>> @@ -0,0 +1,33 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef __BPF_COMPILER_H__
>> +#define __BPF_COMPILER_H__
>> +
>> +#define DO_PRAGMA_(X) _Pragma(#X)
>> +
>> +#if __clang__
>> +#define __pragma_loop_unroll DO_PRAGMA_(clang loop unroll(enable))
>> +#else
>> +/* In GCC -funroll-loops, which is enabled with -O2, should have the
>> +   same impact than the loop-unroll-enable pragma above.  */
>> +#define __pragma_loop_unroll
>> +#endif
>> +
>> +#if __clang__
>> +#define __pragma_loop_unroll_count(N) DO_PRAGMA_(clang loop unroll_count(N))
>> +#else
>> +#define __pragma_loop_unroll_count(N) DO_PRAGMA_(GCC unroll N)
>> +#endif
>> +
>> +#if __clang__
>> +#define __pragma_loop_unroll_full DO_PRAGMA_(clang loop unroll(full))
>> +#else
>> +#define __pragma_loop_unroll_full DO_PRAGMA_(GCC unroll 65534)
>> +#endif
>> +
>> +#if __clang__
>> +#define __pragma_loop_no_unroll DO_PRAGMA_(clang loop unroll(disable))
>> +#else
>> +#define __pragma_loop_no_unroll DO_PRAGMA_(GCC unroll 1)
>> +#endif
>> +
>> +#endif
>> diff --git a/tools/testing/selftests/bpf/progs/iters.c b/tools/testing/selftests/bpf/progs/iters.c
>> index 225f02dd66d0..3db416606f2f 100644
>> --- a/tools/testing/selftests/bpf/progs/iters.c
>> +++ b/tools/testing/selftests/bpf/progs/iters.c
>> @@ -5,6 +5,7 @@
>>   #include <linux/bpf.h>
>>   #include <bpf/bpf_helpers.h>
>>   #include "bpf_misc.h"
>> +#include "bpf_compiler.h"
>>     #define ARRAY_SIZE(x) (int)(sizeof(x) / sizeof((x)[0]))
>>   @@ -183,7 +184,7 @@ int iter_pragma_unroll_loop(const void *ctx)
>>   	MY_PID_GUARD();
>>     	bpf_iter_num_new(&it, 0, 2);
>> -#pragma nounroll
>> +	__pragma_loop_no_unroll
>>   	for (i = 0; i < 3; i++) {
>>   		v = bpf_iter_num_next(&it);
>>   		bpf_printk("ITER_BASIC: E3 VAL: i=%d v=%d", i, v ? *v : -1);
>> @@ -238,7 +239,7 @@ int iter_multiple_sequential_loops(const void *ctx)
>>   	bpf_iter_num_destroy(&it);
>>     	bpf_iter_num_new(&it, 0, 2);
>> -#pragma nounroll
>> +	__pragma_loop_no_unroll
>>   	for (i = 0; i < 3; i++) {
>>   		v = bpf_iter_num_next(&it);
>>   		bpf_printk("ITER_BASIC: E3 VAL: i=%d v=%d", i, v ? *v : -1);
>
> [...]




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux