Re: [PATCH v4 bpf-next 00/14] net: netfilter: add kfunc helper to update ct timeout

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jun 14, 2022 at 07:53:37AM +0530, Kumar Kartikeya Dwivedi wrote:
> On Tue, Jun 14, 2022 at 03:45:00AM IST, Alexei Starovoitov wrote:
> > On Mon, Jun 13, 2022 at 9:14 AM Kumar Kartikeya Dwivedi
> > <memxor@xxxxxxxxx> wrote:
> > >
> > > On Sun, Jun 12, 2022 at 01:41:17AM IST, Alexei Starovoitov wrote:
> > > > On Thu, May 26, 2022 at 11:34:48PM +0200, Lorenzo Bianconi wrote:
> > > > > Changes since v3:
> > > > > - split bpf_xdp_ct_add in bpf_xdp_ct_alloc/bpf_skb_ct_alloc and
> > > > >   bpf_ct_insert_entry
> > > > > - add verifier code to properly populate/configure ct entry
> > > > > - improve selftests
> > > >
> > > > Kumar, Lorenzo,
> > > >
> > > > are you planning on sending v5 ?
> > > > The patches 1-5 look good.
> > > > Patch 6 is questionable as Florian pointed out.
> > >
> > > Yes, it is almost there.
> > >
> > > > What is the motivation to allow writes into ct->status?
> > >
> > > It will only be allowed for ct from alloc function, after that ct = insert(ct)
> > > releases old one with new read only ct. I need to recheck once again with the
> > > code what other bits would cause problems on insert before I rework and reply.
> >
> > I still don't understand why you want to allow writing after alloc.
> >
> 
> It is just a way to set the status, instead of a helper to set it. Eventually
> before nf_conntrack_hash_check_insert it will still be checked and error
> returned for disallowed bits (e.g. anything in IPS_UNCHANGEABLE_MASK, etc.).
> The current series missed that check.
> 
> Florian is right in that it is a can of worms, but I think we can atleast permit
> things like confirmed, assured, etc. which can also be set when crafting a ct
> using netlink. Both of them can share the same check so it is consistent when
> done from kfunc or netlink side, and any changes internally wrt status bits are
> in sync.
> 
> Anyway, if you don't like the direct write into ct, I can drop it, for now just
> insert a confirmed entry (since this was just for testing). That also means
> patch 3-6 are not strictly needed anymore, so they can be dropped, but I can
> keep them if you want, since they might be useful.
> 
> Florian asked for the pipeline, it is like this:
> 
> ct = bpf_xdp_ct_alloc();
> ct->a = ...; // private ct, not yet visible to anyone but us
> ct->b = ...;
>    or we would now set using helpers
> alloc_ct_set_status(ct, IPS_CONFIRMED);
> alloc_ct_set_timeout(ct, timeout);

In other cases it probably will be useful to write into allocated structs,
but ct's timeout and status fields are a bit too special.
It's probably cleaner to generalize ctnetlink_change_status/ctnetlink_change_timeout
as kfuncs and let progs modify the fields through these two helpers.
Especially since timeout and status&IPS_DYING_BIT are related.

Let's indeed drop 3-6 for now.
Though recognizing 'const' modifier in BTF is useful it creates ambiguity.
Unreferenced ptr_to_btf_id is readonly anyway.
This 'const' would make it readable with normal load vs probe_load, but
that difference is too subtle for users. It looks like normal dereference in C
in both cases. Let's keep existing probe_load only for now.

> ...
> ct = bpf_ct_insert_entry(ct); // old alloc_ct release, new inserted nf_conn returned
> if (!ct)
> 	return -1;
> /* Inserted successfully */
> In the earlier approach this ct->a could now not be written to, as it was
> inserted, instead of allocated ct, which insert function took as arg and
> invalidated, so BPF program held a read only pointer now. If we drop that
> approach all pointers are read only anyway, so writing won't be allowed either.
> 
> > > > The selftest doesn't do that anyway.
> > >
> > > Yes, it wasn't updated, we will do that in v5.
> > >
> > > > Patch 7 (acquire-release pairs) is too narrow.
> > > > The concept of a pair will not work well. There could be two acq funcs and one release.
> > >
> > > That is already handled (you define two pairs: acq1, rel and acq2, rel).
> > > There is also an example: bpf_ct_insert_entry -> bpf_ct_release,
> > > bpf_xdp_ct_lookup -> ct_release.
> >
> > If we can represent that without all these additional btf_id sets
> > it would be much better.
> >
> > > > Please think of some other mechanism. Maybe type based? BTF?
> > > > Or encode that info into type name? or some other way.
> > >
> > > Hmm, ok. I kinda dislike this solution too. The other idea that comes to mind is
> > > encapsulating nf_conn into another struct and returning pointer to that:
> > >
> > >         struct nf_conn_alloc {
> > >                 struct nf_conn ct;
> > >         };
> > >
> > >         struct nf_conn_alloc *bpf_xdp_ct_alloc(...);
> > >         struct nf_conn *bpf_ct_insert_entry(struct nf_conn_alloc *act, ...);
> > >
> > > Then nf_conn_alloc gets a different BTF ID, and hence the type can be matched in
> > > the prototype. Any opinions?
> >
> > Yes. Or maybe typedef ?
> > typedef struct nf_conn nf_conn__alloc;
> > typedef struct nf_conn nf_conn__ro;
> >
> > C will accept silent type casts from one type to another,
> > but BTF type checking can be strict?
> > Not sure. wrapping a struct works too, but extra '.ct' accessor
> > might be annoying? Unless you only use it with container_of().
> > I would prefer double or triple underscore to highlight a flavor.
> > struct nf_conn___init {...}
> > The main benefit, of course, is no need for extra btf_id sets.
> > Different types take care of correct arg passing.
> > In that sense typedef idea doesn't quite work,
> > since BTF checks with typedef would be unnecessarily strict
> > compared to regular C type checking rules. That difference
> > in behavior might bite us later.
> > So let's go with struct wrapping.
> 
> Makes sense, I will go with this. But now if we are not even allowing write to
> such allocated ct (probably only helpers that set some field and check value),
> it can just be an empty opaque struct for the BPF program, while it is still
> a nf_conn in the kernel. There doesn't seem to be much point in wrapping around
> nf_conn when reading from allocated nf_conn isn't going to be of any use.

Let's not make it opaque. struct nf_conn is readonly with probe_load.
No need to disable that access either for allocated nf_conn or inserted.



[Index of Archives]     [Netfitler Users]     [Berkeley Packet Filter]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux