On Tue, Jun 14, 2022 at 07:53:37AM +0530, Kumar Kartikeya Dwivedi wrote: > On Tue, Jun 14, 2022 at 03:45:00AM IST, Alexei Starovoitov wrote: > > On Mon, Jun 13, 2022 at 9:14 AM Kumar Kartikeya Dwivedi > > <memxor@xxxxxxxxx> wrote: > > > > > > On Sun, Jun 12, 2022 at 01:41:17AM IST, Alexei Starovoitov wrote: > > > > On Thu, May 26, 2022 at 11:34:48PM +0200, Lorenzo Bianconi wrote: > > > > > Changes since v3: > > > > > - split bpf_xdp_ct_add in bpf_xdp_ct_alloc/bpf_skb_ct_alloc and > > > > > bpf_ct_insert_entry > > > > > - add verifier code to properly populate/configure ct entry > > > > > - improve selftests > > > > > > > > Kumar, Lorenzo, > > > > > > > > are you planning on sending v5 ? > > > > The patches 1-5 look good. > > > > Patch 6 is questionable as Florian pointed out. > > > > > > Yes, it is almost there. > > > > > > > What is the motivation to allow writes into ct->status? > > > > > > It will only be allowed for ct from alloc function, after that ct = insert(ct) > > > releases old one with new read only ct. I need to recheck once again with the > > > code what other bits would cause problems on insert before I rework and reply. > > > > I still don't understand why you want to allow writing after alloc. > > > > It is just a way to set the status, instead of a helper to set it. Eventually > before nf_conntrack_hash_check_insert it will still be checked and error > returned for disallowed bits (e.g. anything in IPS_UNCHANGEABLE_MASK, etc.). > The current series missed that check. > > Florian is right in that it is a can of worms, but I think we can atleast permit > things like confirmed, assured, etc. which can also be set when crafting a ct > using netlink. Both of them can share the same check so it is consistent when > done from kfunc or netlink side, and any changes internally wrt status bits are > in sync. > > Anyway, if you don't like the direct write into ct, I can drop it, for now just > insert a confirmed entry (since this was just for testing). That also means > patch 3-6 are not strictly needed anymore, so they can be dropped, but I can > keep them if you want, since they might be useful. > > Florian asked for the pipeline, it is like this: > > ct = bpf_xdp_ct_alloc(); > ct->a = ...; // private ct, not yet visible to anyone but us > ct->b = ...; > or we would now set using helpers > alloc_ct_set_status(ct, IPS_CONFIRMED); > alloc_ct_set_timeout(ct, timeout); In other cases it probably will be useful to write into allocated structs, but ct's timeout and status fields are a bit too special. It's probably cleaner to generalize ctnetlink_change_status/ctnetlink_change_timeout as kfuncs and let progs modify the fields through these two helpers. Especially since timeout and status&IPS_DYING_BIT are related. Let's indeed drop 3-6 for now. Though recognizing 'const' modifier in BTF is useful it creates ambiguity. Unreferenced ptr_to_btf_id is readonly anyway. This 'const' would make it readable with normal load vs probe_load, but that difference is too subtle for users. It looks like normal dereference in C in both cases. Let's keep existing probe_load only for now. > ... > ct = bpf_ct_insert_entry(ct); // old alloc_ct release, new inserted nf_conn returned > if (!ct) > return -1; > /* Inserted successfully */ > In the earlier approach this ct->a could now not be written to, as it was > inserted, instead of allocated ct, which insert function took as arg and > invalidated, so BPF program held a read only pointer now. If we drop that > approach all pointers are read only anyway, so writing won't be allowed either. > > > > > The selftest doesn't do that anyway. > > > > > > Yes, it wasn't updated, we will do that in v5. > > > > > > > Patch 7 (acquire-release pairs) is too narrow. > > > > The concept of a pair will not work well. There could be two acq funcs and one release. > > > > > > That is already handled (you define two pairs: acq1, rel and acq2, rel). > > > There is also an example: bpf_ct_insert_entry -> bpf_ct_release, > > > bpf_xdp_ct_lookup -> ct_release. > > > > If we can represent that without all these additional btf_id sets > > it would be much better. > > > > > > Please think of some other mechanism. Maybe type based? BTF? > > > > Or encode that info into type name? or some other way. > > > > > > Hmm, ok. I kinda dislike this solution too. The other idea that comes to mind is > > > encapsulating nf_conn into another struct and returning pointer to that: > > > > > > struct nf_conn_alloc { > > > struct nf_conn ct; > > > }; > > > > > > struct nf_conn_alloc *bpf_xdp_ct_alloc(...); > > > struct nf_conn *bpf_ct_insert_entry(struct nf_conn_alloc *act, ...); > > > > > > Then nf_conn_alloc gets a different BTF ID, and hence the type can be matched in > > > the prototype. Any opinions? > > > > Yes. Or maybe typedef ? > > typedef struct nf_conn nf_conn__alloc; > > typedef struct nf_conn nf_conn__ro; > > > > C will accept silent type casts from one type to another, > > but BTF type checking can be strict? > > Not sure. wrapping a struct works too, but extra '.ct' accessor > > might be annoying? Unless you only use it with container_of(). > > I would prefer double or triple underscore to highlight a flavor. > > struct nf_conn___init {...} > > The main benefit, of course, is no need for extra btf_id sets. > > Different types take care of correct arg passing. > > In that sense typedef idea doesn't quite work, > > since BTF checks with typedef would be unnecessarily strict > > compared to regular C type checking rules. That difference > > in behavior might bite us later. > > So let's go with struct wrapping. > > Makes sense, I will go with this. But now if we are not even allowing write to > such allocated ct (probably only helpers that set some field and check value), > it can just be an empty opaque struct for the BPF program, while it is still > a nf_conn in the kernel. There doesn't seem to be much point in wrapping around > nf_conn when reading from allocated nf_conn isn't going to be of any use. Let's not make it opaque. struct nf_conn is readonly with probe_load. No need to disable that access either for allocated nf_conn or inserted.