On Tue, Jun 14, 2022 at 03:45:00AM IST, Alexei Starovoitov wrote: > On Mon, Jun 13, 2022 at 9:14 AM Kumar Kartikeya Dwivedi > <memxor@xxxxxxxxx> wrote: > > > > On Sun, Jun 12, 2022 at 01:41:17AM IST, Alexei Starovoitov wrote: > > > On Thu, May 26, 2022 at 11:34:48PM +0200, Lorenzo Bianconi wrote: > > > > Changes since v3: > > > > - split bpf_xdp_ct_add in bpf_xdp_ct_alloc/bpf_skb_ct_alloc and > > > > bpf_ct_insert_entry > > > > - add verifier code to properly populate/configure ct entry > > > > - improve selftests > > > > > > Kumar, Lorenzo, > > > > > > are you planning on sending v5 ? > > > The patches 1-5 look good. > > > Patch 6 is questionable as Florian pointed out. > > > > Yes, it is almost there. > > > > > What is the motivation to allow writes into ct->status? > > > > It will only be allowed for ct from alloc function, after that ct = insert(ct) > > releases old one with new read only ct. I need to recheck once again with the > > code what other bits would cause problems on insert before I rework and reply. > > I still don't understand why you want to allow writing after alloc. > It is just a way to set the status, instead of a helper to set it. Eventually before nf_conntrack_hash_check_insert it will still be checked and error returned for disallowed bits (e.g. anything in IPS_UNCHANGEABLE_MASK, etc.). The current series missed that check. Florian is right in that it is a can of worms, but I think we can atleast permit things like confirmed, assured, etc. which can also be set when crafting a ct using netlink. Both of them can share the same check so it is consistent when done from kfunc or netlink side, and any changes internally wrt status bits are in sync. Anyway, if you don't like the direct write into ct, I can drop it, for now just insert a confirmed entry (since this was just for testing). That also means patch 3-6 are not strictly needed anymore, so they can be dropped, but I can keep them if you want, since they might be useful. Florian asked for the pipeline, it is like this: ct = bpf_xdp_ct_alloc(); ct->a = ...; // private ct, not yet visible to anyone but us ct->b = ...; or we would now set using helpers alloc_ct_set_status(ct, IPS_CONFIRMED); alloc_ct_set_timeout(ct, timeout); ... ct = bpf_ct_insert_entry(ct); // old alloc_ct release, new inserted nf_conn returned if (!ct) return -1; /* Inserted successfully */ In the earlier approach this ct->a could now not be written to, as it was inserted, instead of allocated ct, which insert function took as arg and invalidated, so BPF program held a read only pointer now. If we drop that approach all pointers are read only anyway, so writing won't be allowed either. > > > The selftest doesn't do that anyway. > > > > Yes, it wasn't updated, we will do that in v5. > > > > > Patch 7 (acquire-release pairs) is too narrow. > > > The concept of a pair will not work well. There could be two acq funcs and one release. > > > > That is already handled (you define two pairs: acq1, rel and acq2, rel). > > There is also an example: bpf_ct_insert_entry -> bpf_ct_release, > > bpf_xdp_ct_lookup -> ct_release. > > If we can represent that without all these additional btf_id sets > it would be much better. > > > > Please think of some other mechanism. Maybe type based? BTF? > > > Or encode that info into type name? or some other way. > > > > Hmm, ok. I kinda dislike this solution too. The other idea that comes to mind is > > encapsulating nf_conn into another struct and returning pointer to that: > > > > struct nf_conn_alloc { > > struct nf_conn ct; > > }; > > > > struct nf_conn_alloc *bpf_xdp_ct_alloc(...); > > struct nf_conn *bpf_ct_insert_entry(struct nf_conn_alloc *act, ...); > > > > Then nf_conn_alloc gets a different BTF ID, and hence the type can be matched in > > the prototype. Any opinions? > > Yes. Or maybe typedef ? > typedef struct nf_conn nf_conn__alloc; > typedef struct nf_conn nf_conn__ro; > > C will accept silent type casts from one type to another, > but BTF type checking can be strict? > Not sure. wrapping a struct works too, but extra '.ct' accessor > might be annoying? Unless you only use it with container_of(). > I would prefer double or triple underscore to highlight a flavor. > struct nf_conn___init {...} > The main benefit, of course, is no need for extra btf_id sets. > Different types take care of correct arg passing. > In that sense typedef idea doesn't quite work, > since BTF checks with typedef would be unnecessarily strict > compared to regular C type checking rules. That difference > in behavior might bite us later. > So let's go with struct wrapping. Makes sense, I will go with this. But now if we are not even allowing write to such allocated ct (probably only helpers that set some field and check value), it can just be an empty opaque struct for the BPF program, while it is still a nf_conn in the kernel. There doesn't seem to be much point in wrapping around nf_conn when reading from allocated nf_conn isn't going to be of any use. -- Kartikeya