On Fri, Jul 07, 2023 at 07:24:48PM +0200, Daniel Borkmann wrote: > + > +#define BPF_MPROG_KEEP 0 > +#define BPF_MPROG_SWAP 1 > +#define BPF_MPROG_FREE 2 Please document how this is suppose to be used. Patch 2 is using BPF_MPROG_FREE in tcx_entry_needs_release(). Where most of the code treats BPF_MPROG_SWAP and BPF_MPROG_FREE as equivalent. I can guess what it's for, but a comment would help. > + > +#define BPF_MPROG_MAX 64 we've seen buggy user space attaching thousands of tc progs to the same netdev. I suspect 64 limit will be hit much sooner than expected. > + > +#define bpf_mprog_foreach_tuple(entry, fp, cp, t) \ > + for (fp = &entry->fp_items[0], cp = &entry->parent->cp_items[0];\ > + ({ \ > + t.prog = READ_ONCE(fp->prog); \ > + t.link = cp->link; \ > + t.prog; \ > + }); \ > + fp++, cp++) > + > +#define bpf_mprog_foreach_prog(entry, fp, p) \ > + for (fp = &entry->fp_items[0]; \ > + (p = READ_ONCE(fp->prog)); \ > + fp++) I have similar questions to Stanislav. Looks like update/delete/query of bpf_prog should be protected by an external lock (like RTNL in case of tcx), but what are the life time rules for 'entry'? Looking at patch 2 sch_handle_ingress(): struct bpf_mprog_entry *entry = rcu_dereference_bh(skb->dev->tcx_ingress); I suspect the assumption is that bpf_mprog_entry object should be accessed within RCU critical section. Since tc/tcx and XDP run in napi we have RCU protection there. In the future, for cgroups, bpf_prog_run_array_cg() will keep explicit rcu_read_lock() before accessing bpf_mprog_entry, right? And bpf_mprog_commit() assumes that RCU protection. All fine, but we need to document that mprog mechanism is not suitable for sleepable progs. > + if (flags & BPF_F_BEFORE) { > + tidx = bpf_mprog_pos_before(entry, &rtuple); > + if (tidx < -1 || (idx >= -1 && tidx != idx)) { > + ret = tidx < -1 ? tidx : -EDOM; > + goto out; > + } > + idx = tidx; > + } > + if (flags & BPF_F_AFTER) { > + tidx = bpf_mprog_pos_after(entry, &rtuple); > + if (tidx < 0 || (idx >= -1 && tidx != idx)) { tidx < 0 vs tidx < -1 for _after vs _before. Does it have to have this subtle difference? Can _after and _before have the same semantics for return value? > + ret = tidx < 0 ? tidx : -EDOM; > + goto out; > + } > + idx = tidx; > + }