On Fri, Oct 4, 2019 at 8:58 AM Stanislav Fomichev <sdf@xxxxxxxxxx> wrote:
>
> Always use init_net flow dissector BPF program if it's attached and fall
> back to the per-net namespace one. Also, deny installing new programs if
> there is already one attached to the root namespace.
> Users can still detach their BPF programs, but can't attach any
> new ones (-EEXIST).
>
> Cc: Petar Penkov <ppenkov@xxxxxxxxxx>
> Signed-off-by: Stanislav Fomichev <sdf@xxxxxxxxxx>
> ---

Looks good, but see my note below. Regardless:

Acked-by: Andrii Nakryiko <andriin@xxxxxx>

>  Documentation/bpf/prog_flow_dissector.rst |  3 ++
>  net/core/flow_dissector.c                 | 42 ++++++++++++++++++++---
>  2 files changed, 41 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/bpf/prog_flow_dissector.rst b/Documentation/bpf/prog_flow_dissector.rst
> index a78bf036cadd..4d86780ab0f1 100644
> --- a/Documentation/bpf/prog_flow_dissector.rst
> +++ b/Documentation/bpf/prog_flow_dissector.rst
> @@ -142,3 +142,6 @@ BPF flow dissector doesn't support exporting all the metadata that in-kernel
>  C-based implementation can export. Notable example is single VLAN (802.1Q)
>  and double VLAN (802.1AD) tags. Please refer to the ``struct bpf_flow_keys``
>  for a set of information that's currently can be exported from the BPF context.
> +
> +When BPF flow dissector is attached to the root network namespace (machine-wide
> +policy), users can't override it in their child network namespaces.
> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> index 7c09d87d3269..9821e730fc70 100644
> --- a/net/core/flow_dissector.c
> +++ b/net/core/flow_dissector.c
> @@ -114,19 +114,50 @@ int skb_flow_dissector_bpf_prog_attach(const union bpf_attr *attr,
>  {
>  	struct bpf_prog *attached;
>  	struct net *net;
> +	int ret = 0;
>
>  	net = current->nsproxy->net_ns;
>  	mutex_lock(&flow_dissector_mutex);
> +
> +	if (net == &init_net) {
> +		/* BPF flow dissector in the root namespace overrides
> +		 * any per-net-namespace one. When attaching to root,
> +		 * make sure we don't have any BPF program attached
> +		 * to the non-root namespaces.
> +		 */
> +		struct net *ns;
> +
> +		for_each_net(ns) {
> +			if (net == &init_net)
> +				continue;

You don't need this condition, if something is attached to init_net,
you will return -EEXIST anyway. Or is this a performance optimization?

> +
> +			if (rcu_access_pointer(ns->flow_dissector_prog)) {
> +				ret = -EEXIST;
> +				goto out;
> +			}
> +		}
> +	} else {
> +		/* Make sure root flow dissector is not attached
> +		 * when attaching to the non-root namespace.
> +		 */
> +

nit: extra empty line

> +		if (rcu_access_pointer(init_net.flow_dissector_prog)) {
> +			ret = -EEXIST;
> +			goto out;
> +		}
> +	}
> +
>  	attached = rcu_dereference_protected(net->flow_dissector_prog,
>  					     lockdep_is_held(&flow_dissector_mutex));
>  	if (attached) {
>  		/* Only one BPF program can be attached at a time */
> -		mutex_unlock(&flow_dissector_mutex);
> -		return -EEXIST;
> +		ret = -EEXIST;
> +		goto out;
>  	}
>  	rcu_assign_pointer(net->flow_dissector_prog, prog);
> +out:
>  	mutex_unlock(&flow_dissector_mutex);
> -	return 0;
> +	return ret;
>  }
>
>  int skb_flow_dissector_bpf_prog_detach(const union bpf_attr *attr)
> @@ -910,7 +941,10 @@ bool __skb_flow_dissect(const struct net *net,
>  		WARN_ON_ONCE(!net);
>  		if (net) {
>  			rcu_read_lock();
> -			attached = rcu_dereference(net->flow_dissector_prog);
> +			attached = rcu_dereference(init_net.flow_dissector_prog);
> +
> +			if (!attached)
> +				attached = rcu_dereference(net->flow_dissector_prog);
>
>  			if (attached) {
>  				struct bpf_flow_keys flow_keys;
> --
> 2.23.0.581.g78d2f28ef7-goog
>
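
For reference, here is how I read the intended semantics, as a rough userspace sketch rather than kernel code. Everything named toy_* is hypothetical, ns[0] stands in for init_net by assumption, and the mutex and RCU handling are left out. The root branch deliberately has no skip for the root namespace in the loop, since a program already attached there ends in -EEXIST either way.

```c
#include <errno.h>
#include <stddef.h>

#define TOY_NR_NS 3	/* assumption: a fixed, tiny set of namespaces */

struct toy_prog { const char *name; };
struct toy_net { struct toy_prog *flow_dissector_prog; };

static struct toy_net ns[TOY_NR_NS];
static struct toy_net *const root = &ns[0];	/* plays the role of init_net */

/* Attach rule: a root program and per-namespace programs are mutually
 * exclusive, and each namespace holds at most one program.
 */
static int toy_attach(struct toy_net *net, struct toy_prog *prog)
{
	if (net == root) {
		/* Attaching to root: refuse if any namespace, root
		 * included, already has a program.
		 */
		for (size_t i = 0; i < TOY_NR_NS; i++)
			if (ns[i].flow_dissector_prog)
				return -EEXIST;
	} else if (root->flow_dissector_prog) {
		/* Attaching to a child: refuse while root has a program. */
		return -EEXIST;
	}

	/* Only one program per namespace at a time. */
	if (net->flow_dissector_prog)
		return -EEXIST;

	net->flow_dissector_prog = prog;
	return 0;
}

/* Detach is always allowed. */
static void toy_detach(struct toy_net *net)
{
	net->flow_dissector_prog = NULL;
}

/* Lookup order on the dissect path: root first, then the packet's own
 * namespace.
 */
static struct toy_prog *toy_lookup(struct toy_net *net)
{
	struct toy_prog *attached = root->flow_dissector_prog;

	if (!attached)
		attached = net->flow_dissector_prog;
	return attached;
}
```

With this model, attaching to root fails while any child still holds a program, and once root holds one, every child's toy_attach() fails while toy_lookup() resolves to the root program for all namespaces.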