Hi Stan, On Thu, Mar 23, 2023 at 11:31:14AM -0700, Stanislav Fomichev wrote: > On Wed, Mar 22, 2023 at 5:41 PM Daniel Xu <dxu@xxxxxxxxx> wrote: > > > > Hi Florian, Stan, > > > > On Fri, Mar 03, 2023 at 01:27:52AM +0100, Florian Westphal wrote: > > > Stanislav Fomichev <sdf@xxxxxxxxxx> wrote: > > > > On 03/02, Florian Westphal wrote: > > > > > + struct { > > > > > + __u32 pf; > > > > > + __u32 hooknum; > > > > > + __s32 prio; > > > > > + } netfilter; > > > > > > > > For recent tc BPF program extensions, we've discussed that it might be > > > > better > > > > to have an option to attach program before/after another one in the chain. > > > > So the API essentially would receive a before/after flag + fd/id of the > > > > > > > > Should we do something similar here? See [0] for the original > > > > discussion. > > > > > > > > 0: https://lore.kernel.org/bpf/YzzWDqAmN5DRTupQ@xxxxxxxxxx/ > > > > > > Thanks for the pointer, I will have a look. > > > > > > The above exposes the "prio" of netfilter hooks, so someone > > > that needs their hook to run early on, say, before netfilters > > > nat engine, could just use INT_MIN. > > > > > > We could -- for nf bpf -- make the bpf_link fail if a hook > > > with the same priority already exists to avoid the "undefined > > > behaviour" here (same prio means register order decides what > > > hook function runs first ...). > > > > > > This could be relevant if you have e.g. one bpf program collecting > > > statistics vs. one doing drops. > > > > > > I'll dig though the thread and would try to mimic the tc link > > > mechanism as close as possible. > > > > While I think the direction the TC link discussion took is totally fine, > > TC has the advantage (IIUC) of being a somewhat isolated hook. Meaning > > it does not make sense for a user to mix priority values && before/after > > semantics. > > > > Netfilter is different in that there is by default modules active with > > fixed priority values. So mixing in before/after semantics here could > > get confusing. > > I don't remember the details, so pls correct me, but last time I > looked, this priority was basically an ordering within a hook? Yeah, that is my understanding as well. > And there were a bunch of kernel-hardcoded values. So either that > whole story has to become a UAPI (so the bpf program knows > before/after which kernel hook it has to run), or we need some other > ordering mechanism. I'm not sure what you mean by "whole story" but netfilter kernel modules register via a priority value as well. As well as the modules the kernel ships. So there's that to consider. (I'm not sure what's the story with bpf vs kernel > hooks interop, so maybe it's all moot?) > Am I missing something? Can you share more about why those fixed > priorities are fine? I guess I wouldn't say it's ideal (for all the reasons brought up in the previous discussion), but trying to use before/after semantics here would necessarily pull in a netfilter rework too, no? Or maybe there's some clever way to merge the two worlds and get both subsystems what they want. Thanks, Daniel