Re: [PATCH bpf-next 2/6] bpf, net: rework cookie generator as per-cpu one

Jakub Kicinski <kuba@xxxxxxxxxx> · Fri, 25 Sep 2020 08:00:20 -0700

On Fri, 25 Sep 2020 00:03:14 +0200 Daniel Borkmann wrote:
> static inline u64 gen_cookie_next(struct gen_cookie *gc)
> {
>          u64 val;
> 
>          if (likely(this_cpu_inc_return(*gc->level_nesting) == 1)) {

Is this_cpu_inc() in itself atomic?

Is there a comparison of performance of various atomic ops and locking
somewhere? I wonder how this scheme would compare to using a cmpxchg.
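
For reference, a rough userspace sketch of what a plain cmpxchg-based
generator could look like, written with C11 atomics rather than the kernel
primitives. The names shared_cookie and cookie_cmpxchg_next() are made up
for illustration and are not part of the patch; this is only one possible
reading of "using a cmpxchg", not a measured alternative.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical single shared counter, contended by all CPUs. */
static _Atomic uint64_t shared_cookie;

/* Hypothetical helper: hand out the next non-zero cookie via a CAS loop. */
static inline uint64_t cookie_cmpxchg_next(void)
{
        uint64_t old = atomic_load_explicit(&shared_cookie,
                                            memory_order_relaxed);
        uint64_t new;

        do {
                new = old + 1;
                if (new == 0)   /* 0 is reserved, skip it on wrap-around */
                        new = 1;
                /* On failure, 'old' is reloaded with the current value. */
        } while (!atomic_compare_exchange_weak_explicit(&shared_cookie,
                                                        &old, new,
                                                        memory_order_relaxed,
                                                        memory_order_relaxed));
        return new;
}
```

Every caller hits the same cache line here, which is the cost the per-cpu
batching in the patch is trying to avoid.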

>                  u64 *local_last = this_cpu_ptr(gc->local_last);
> 
>                  val = *local_last;
>                  if (__is_defined(CONFIG_SMP) &&
>                      unlikely((val & (COOKIE_LOCAL_BATCH - 1)) == 0)) {

Can we reasonably assume we won't have more than 4k CPUs and just
statically divide this space by encoding CPU id in top bits?
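
Something along these lines, as a userspace sketch only. The 12-bit split
follows from the 4k CPU assumption above; COOKIE_CPU_BITS, COOKIE_SEQ_BITS,
COOKIE_SEQ_MASK and cookie_static_next() are hypothetical names, and the
caller is assumed to pass its own CPU id and a counter private to that CPU.

```c
#include <stdint.h>

#define COOKIE_CPU_BITS 12      /* assumed cap: at most 4096 CPUs */
#define COOKIE_SEQ_BITS (64 - COOKIE_CPU_BITS)
#define COOKIE_SEQ_MASK ((UINT64_C(1) << COOKIE_SEQ_BITS) - 1)

/*
 * Hypothetical helper: the top bits carry the CPU id, the low 52 bits
 * carry a per-CPU sequence number. No shared atomic is touched.
 * Limitation: once a CPU has used up its 2^52 values the sequence
 * wraps and cookies from that CPU would repeat.
 */
static inline uint64_t cookie_static_next(unsigned int cpu,
                                          uint64_t *local_seq)
{
        uint64_t seq = ++(*local_seq) & COOKIE_SEQ_MASK;

        if (seq == 0)   /* keep 0 reserved, as the patch does */
                seq = ++(*local_seq) & COOKIE_SEQ_MASK;

        return ((uint64_t)cpu << COOKIE_SEQ_BITS) | seq;
}
```

The trade-off is that cookies are no longer globally monotonic, and each
CPU only gets a 52-bit range instead of sharing the full 64 bits.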

>                          s64 next = atomic64_add_return(COOKIE_LOCAL_BATCH,
>                                                         &gc->shared_last);
>                          val = next - COOKIE_LOCAL_BATCH;
>                  }
>                  val++;
>                  if (unlikely(!val))
>                          val++;
>                  *local_last = val;
>          } else {
>                  val = atomic64_add_return(COOKIE_LOCAL_BATCH,
>                                            &gc->shared_last);
>          }
>          this_cpu_dec(*gc->level_nesting);
>          return val;
> }