Re: [RFC PATCH bpf-next 0/2] bpf: Implement shared persistent fast(er) sk_storage mode

On 08/23, Alexei Starovoitov wrote:
> On Mon, Aug 23, 2021 at 05:52:50PM -0400, Hans Montero wrote:
> > From: Hans Montero <hjm2133@xxxxxxxxxxxx>
> >
> > This patch set adds a BPF local storage optimization. The first patch adds
> > the feature, and the second patch extends the bpf selftests so that the
> > feature is tested.
> >
> > We are running BPF programs for each egress packet and noticed that
> > bpf_sk_storage_get incurs a significant amount of cpu time. By inlining the
> > storage into struct sock and accessing that instead of performing a map
> > lookup, we expect to reduce overhead for our specific use-case.
>
> Looks like a hack to me. Please share the perf numbers and setup details.
> I think there should be a different way to address performance concerns
> without going into such hacks.

What kind of perf numbers would you like to see? What we see here is
that bpf_sk_storage_get() cycles are somewhere on par with hashtable
lookups (we've moved off of a 5-tuple ht lookup to sk_storage). Looking
at the code, it seems the cost mostly comes from the following:

  sk_storage = rcu_dereference(sk->sk_bpf_storage);
  sdata = rcu_dereference(sk_storage->cache[smap->cache_idx]);
  return sdata->data;

We do 3 cold-cache references :-( This is where the idea of inlining
something in the socket itself came from. The RFC is just to present
the case and discuss. I was thinking about doing some kind of
inlining at runtime (with a fallback to the non-inlined case) but wanted
to start by discussing this compile-time option first.
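
Just to illustrate the direction, here is a rough sketch of what the
compile-time option could look like; the Kconfig symbol, the field name
and the fixed size below are made up for illustration and not necessarily
what the RFC patch does:

  /* hypothetical Kconfig knob and fixed value size */
  #ifdef CONFIG_BPF_SK_STORAGE_INLINE
  #define BPF_SK_STORAGE_INLINE_SIZE 64
  #endif

  struct sock {
  	/* ... existing members ... */
  	struct bpf_local_storage __rcu *sk_bpf_storage;
  #ifdef CONFIG_BPF_SK_STORAGE_INLINE
  	/* value of the single "inlined" sk_storage map lives right here */
  	u8 sk_bpf_inline_storage[BPF_SK_STORAGE_INLINE_SIZE];
  #endif
  };

  #ifdef CONFIG_BPF_SK_STORAGE_INLINE
  /* fast path: one pointer offset instead of three dependent loads */
  static inline void *bpf_sk_storage_get_inlined(struct sock *sk)
  {
  	return sk->sk_bpf_inline_storage;
  }
  #endif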

We can also try to save sdata somewhere in the socket for the cached case,
which would avoid the extra lookup and potentially save us two rcu_dereference()s.
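Something along these lines, purely as a sketch (the field name is made up
and all the invalidation details are hand-waved away):

  struct sock {
  	/* ... existing members ... */
  	/* hypothetical: last sdata returned for the shared map; set on the
  	 * first successful lookup, cleared when the element or the map
  	 * goes away
  	 */
  	struct bpf_local_storage_data *sk_bpf_storage_cached;
  };

  static inline void *bpf_sk_storage_get_cached(struct sock *sk)
  {
  	struct bpf_local_storage_data *sdata;

  	sdata = READ_ONCE(sk->sk_bpf_storage_cached);
  	if (likely(sdata))
  		return sdata->data;	/* one load instead of three */
  	return NULL;			/* fall back to the regular lookup */
  }
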
Is that something that looks acceptable? I was wondering whether you've
considered any socket storage optimizations on your side?

I can try to set up some office hours to discuss in person if that's
preferred.

> > This also has a
> > side-effect of persisting the socket storage, which can be beneficial.
>
> Without explicit opt-in such sharing will cause multiple bpf progs to corrupt
> each other's data.

The new BPF_F_SHARED_LOCAL_STORAGE flag is there to provide this opt-in.
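
For completeness, opting in from the BPF side could look roughly like the
libbpf-style map definition below; struct my_shared_value and the program
body are placeholders, and the exact semantics of the flag are whatever the
RFC defines:

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  /* placeholder value layout; programs that want to share the storage
   * would have to agree on it
   */
  struct my_shared_value {
  	__u64 tx_bytes;
  };

  struct {
  	__uint(type, BPF_MAP_TYPE_SK_STORAGE);
  	/* sk_storage maps require BPF_F_NO_PREALLOC;
  	 * BPF_F_SHARED_LOCAL_STORAGE is the opt-in from this RFC
  	 */
  	__uint(map_flags, BPF_F_NO_PREALLOC | BPF_F_SHARED_LOCAL_STORAGE);
  	__type(key, int);
  	__type(value, struct my_shared_value);
  } shared_sk_storage SEC(".maps");

  SEC("cgroup/sock_create")
  int sock_create(struct bpf_sock *ctx)
  {
  	struct my_shared_value *val;

  	val = bpf_sk_storage_get(&shared_sk_storage, ctx, 0,
  				 BPF_SK_STORAGE_GET_F_CREATE);
  	if (val)
  		val->tx_bytes = 0;
  	return 1;	/* allow the socket either way */
  }

The idea is that sharing (and the corruption risk that comes with it) only
happens when every map involved explicitly sets the flag.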


