On Tue, Dec 14, 2021 at 11:40:26AM +0000, Pavel Begunkov wrote: > On 12/14/21 07:27, Martin KaFai Lau wrote: > > On Sat, Dec 11, 2021 at 07:17:49PM +0000, Pavel Begunkov wrote: > > > cgroup_bpf_enabled_key static key guards from overhead in cases where > > > no cgroup bpf program of a specific type is loaded in any cgroup. Turn > > > out that's not always good enough, e.g. when there are many cgroups but > > > ones that we're interesting in are without bpf. It's seen in server > > > environments, but the problem seems to be even wider as apparently > > > systemd loads some BPF affecting my laptop. > > > > > > Profiles for small packet or zerocopy transmissions over fast network > > > show __cgroup_bpf_run_filter_skb() taking 2-3%, 1% of which is from > > > migrate_disable/enable(), and similarly on the receiving side. Also > > > got +4-5% of t-put for local testing. > > What is t-put? throughput? > > yes > > > Local testing means sending to lo/dummy? > > yes, it was dummy specifically Thanks for confirming. Please also put these details in the commit log. I was slow. With only '%' as a unit, it took me a min to guess what t-put may mean ;) > > [ ... ] > > > > > diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h > > > index 11820a430d6c..793e4f65ccb5 100644 > > > --- a/include/linux/bpf-cgroup.h > > > +++ b/include/linux/bpf-cgroup.h > > > @@ -219,11 +219,28 @@ int bpf_percpu_cgroup_storage_copy(struct bpf_map *map, void *key, void *value); > > > int bpf_percpu_cgroup_storage_update(struct bpf_map *map, void *key, > > > void *value, u64 flags); > > > +static inline bool > > > +__cgroup_bpf_prog_array_is_empty(struct cgroup_bpf *cgrp_bpf, > > > + enum cgroup_bpf_attach_type type) > > Lets remove this. > > > > > +{ > > > + struct bpf_prog_array *array = rcu_access_pointer(cgrp_bpf->effective[type]); > > > + > > > + return array == &empty_prog_array.hdr; > > > +} > > > + > > > +#define CGROUP_BPF_TYPE_ENABLED(sk, atype) \ > > and change cgroup.c to directly use this instead, so > > everywhere holding a fullsock sk will use this instead > > of having two helpers for empty check. > > Why? As mentioned earlier, prefer to have one way to do the same thing for checking with a fullsock. > CGROUP_BPF_TYPE_ENABLED can't be a function atm because of header > dependency hell, and so it'd kill some of typization, which doesn't add > clarity. I didn't mean to change it to a function. I actually think, for the sk context, it should eventually be folded with the existing cgroup_bpf_enabled() macro because those are the tests to ensure there is bpf prog to run before proceeding. Need to audit about the non fullsock case. not sure yet. > And also it imposes some extra overhead to *sockopt using > the first helper directly. I think it is unimportant unless it is measurable in normal use case. > I think it's better with two of them. Ok. I won't insist. There are atype that may not have sk, so a separate inline function for checking emptiness may eventually be useful there. > I could inline the second one, but it wouldn't have been pretty. Leaving CGROUP_BPF_TYPE_ENABLED as macro is fine.