Re: [PATCH RFC v2 bpf-next 1/3] bpf: add bpf_link support for BPF_NETFILTER programs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Mar 24, 2023 at 10:33 AM Daniel Xu <dxu@xxxxxxxxx> wrote:
>
> Hi Stan,
>
> On Thu, Mar 23, 2023 at 11:31:14AM -0700, Stanislav Fomichev wrote:
> > On Wed, Mar 22, 2023 at 5:41 PM Daniel Xu <dxu@xxxxxxxxx> wrote:
> > >
> > > Hi Florian, Stan,
> > >
> > > On Fri, Mar 03, 2023 at 01:27:52AM +0100, Florian Westphal wrote:
> > > > Stanislav Fomichev <sdf@xxxxxxxxxx> wrote:
> > > > > On 03/02, Florian Westphal wrote:
> > > > > > +                 struct {
> > > > > > +                         __u32           pf;
> > > > > > +                         __u32           hooknum;
> > > > > > +                         __s32           prio;
> > > > > > +                 } netfilter;
> > > > >
> > > > > For recent tc BPF program extensions, we've discussed that it might be
> > > > > better
> > > > > to have an option to attach program before/after another one in the chain.
> > > > > So the API essentially would receive a before/after flag + fd/id of the
> > > > >
> > > > > Should we do something similar here? See [0] for the original
> > > > > discussion.
> > > > >
> > > > > 0: https://lore.kernel.org/bpf/YzzWDqAmN5DRTupQ@xxxxxxxxxx/
> > > >
> > > > Thanks for the pointer, I will have a look.
> > > >
> > > > The above exposes the "prio" of netfilter hooks, so someone
> > > > that needs their hook to run early on, say, before netfilters
> > > > nat engine, could just use INT_MIN.
> > > >
> > > > We could -- for nf bpf -- make the bpf_link fail if a hook
> > > > with the same priority already exists to avoid the "undefined
> > > > behaviour" here (same prio means register order decides what
> > > > hook function runs first ...).
> > > >
> > > > This could be relevant if you have e.g. one bpf program collecting
> > > > statistics vs. one doing drops.
> > > >
> > > > I'll dig though the thread and would try to mimic the tc link
> > > > mechanism as close as possible.
> > >
> > > While I think the direction the TC link discussion took is totally fine,
> > > TC has the advantage (IIUC) of being a somewhat isolated hook. Meaning
> > > it does not make sense for a user to mix priority values && before/after
> > > semantics.
> > >
> > > Netfilter is different in that there is by default modules active with
> > > fixed priority values. So mixing in before/after semantics here could
> > > get confusing.
> >
> > I don't remember the details, so pls correct me, but last time I
> > looked, this priority was basically an ordering within a hook?
>
> Yeah, that is my understanding as well.
>
> > And there were a bunch of kernel-hardcoded values. So either that
> > whole story has to become a UAPI (so the bpf program knows
> > before/after which kernel hook it has to run), or we need some other
> > ordering mechanism.
>
> I'm not sure what you mean by "whole story" but netfilter kernel modules
> register via a priority value as well. As well as the modules the kernel
> ships. So there's that to consider.

Sorry for not being clear. What I meant here is that we'd have to
export those existing priorities in the UAPI headers and keep those
numbers stable. Otherwise it seems impossible to have a proper interop
between those fixed existing priorities and new bpf modules?
(idk if that's a real problem or I'm overthinking)

Because the problem as I see it is as follows:
Let's say I want to trigger before/after kernel module X. I need to
know its priority and it has to be stable.
Alternatively, there should be some way to query the priority of
module X so I can do +1/-1 (which is essentially "before X/after X"
semantics).

> (I'm not sure what's the story with bpf vs kernel
> > hooks interop, so maybe it's all moot?)
> > Am I missing something? Can you share more about why those fixed
> > priorities are fine?
>
> I guess I wouldn't say it's ideal (for all the reasons brought up in the
> previous discussion), but trying to use before/after semantics here
> would necessarily pull in a netfilter rework too, no? Or maybe there's
> some clever way to merge the two worlds and get both subsystems what
> they want.

Right, I don't have an answer here, just trying to understand whether
(as a side effect of those patches) we're really making those existing
priorities a UAPI or not :-)

> Thanks,
> Daniel




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux