On Thu, Jan 19, 2023 at 11:23:18PM -0600, David Vernet wrote:
> On Fri, Jan 20, 2023 at 10:28:15AM +0530, Kumar Kartikeya Dwivedi wrote:
> > On Fri, Jan 20, 2023 at 05:28:27AM IST, David Vernet wrote:
> > > When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier
> > > currently enforces that the top-level type must match when calling
> > > the kfunc. In other words, the verifier does not allow the BPF program
> > > to pass a bitwise equivalent struct, despite it being functionally safe.
> > > For example, if you have the following type:
> > >
> > > struct nf_conn___init {
> > > 	struct nf_conn ct;
> > > };
> > >
> > > It would be safe to pass a struct nf_conn___init to a kfunc expecting a
> > > struct nf_conn.
> >
> > Just running bpf_nf selftest would have shown this is false.
>
> And I feel silly, because I did run them, and could have sworn they
> passed...looking now at the change_status_after_alloc testcase I see
> you're of course correct. Very poor example, thank you for pointing it
> out.
>
> >
> > > Being able to do this will be useful for certain types
> > > of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a
> > > series of kfuncs will be added which allow programs to do bitwise
> > > queries on cpumasks that are either allocated by the program (in which
> > > case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as
> > > its first element), or a cpumask that was allocated by the main kernel
> > > (in which case it will just be a straight cpumask_t, as in
> > > task->cpus_ptr).
> > >
> > > Having the two types of cpumasks allows us to distinguish between the
> > > two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask
> > > can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t
> > > cannot be. On the other hand, a struct bpf_cpumask can of course be
> > > queried in the exact same manner as a cpumask_t, with e.g.
> > > bpf_cpumask_test_cpu().
> > >
> > > If we were to enforce that top level types match, then a user that's
> > > passing a struct bpf_cpumask to a read-only cpumask_t argument would
> > > have to cast with something like bpf_cast_to_kern_ctx() (which itself
> > > would need to be updated to expect the alias, and currently it only
> > > accommodates a single alias per prog type). Additionally, not specifying
> > > KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a
> > > struct bpf_cpumask *, and another as a struct cpumask *
> > > (i.e. cpumask_t).
> > >
> > > In order to enable this, this patch relaxes the constraint that a
> > > KF_TRUSTED_ARGS kfunc must have strict type matching. In order to
> > > try and be conservative and match existing behavior / expectations, this
> > > patch also enforces strict type checking for acquire kfuncs. We were
> > > already enforcing it for release kfuncs, so this should also improve the
> > > consistency of the semantics for kfuncs.
> > >
> >
> > What you want is to simply follow type at off = 0 (but still enforce the off = 0
> > requirement). This is something which is currently done for bpf_sk_release (for
> > struct sk_common) in check_reg_type, but it is not safe in general to just open
> > this up for all cases. I suggest encoding this particular requirement in the
> > argument, and simply using triple underscore variant of the type for the special
> > 'read_only' requirement. This will allow you to use same type in your BPF C
> > program, while allowing verifier to see them as two different types in kfunc
> > parameters. Then just relax type following for the particular argument so that
> > one can pass cpumask_t___ro to kfunc expecting cpumask_t (but only at off = 0,
> > it just visits first member after failing match on top level type). off = 0
> > check is still necessary.
>
> Sigh, yeah, another ___ workaround but I agree it's probably the best we
> can do for now, and in general seems pretty useful.
> Obviously preferable to this patch which just doesn't work. Alexei, are
> you OK with this? If so, I'll take this approach for v2.

We decided to rely on a strict type match when we introduced
'struct nf_conn___init', but in doing so we bent the C standard in what
looks to be the wrong direction.

For the definition:

struct nf_conn___init {
	struct nf_conn ct;
};

if a kfunc accepts a pointer to nf_conn, it should always accept a pointer
to nf_conn___init for both read and write, because in C that's a valid and
safe type cast.

We can fix this design issue by saying that the '___init' suffix is special
and C type casting rules don't apply to it. In all other cases, such as
bpf_cpumask/cpumask, we should allow it.

The ___ro suffix idea will keep moving us into further discrepancies with C.
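
To make the C rule concrete, here is a minimal userspace sketch of the cast in question. The types and helpers (struct mask, struct wrapped_mask, mask_test_bit(), wrapped_mask_set_bit(), as_mask()) are hypothetical stand-ins for the cpumask / bpf_cpumask pair, not kernel definitions, and the sketch says nothing about how the verifier would implement the check. It only shows that a pointer to a wrapper struct, converted to a pointer to its first member, refers to that member at offset 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the wrapped type (not the kernel's cpumask). */
struct mask {
	unsigned long bits;
};

/* Hypothetical wrapper: the wrapped type is the first member (off = 0). */
struct wrapped_mask {
	struct mask m;
	int refcount;
};

/* The conversion below is only meaningful if the member sits at offset 0. */
static_assert(offsetof(struct wrapped_mask, m) == 0,
	      "mask must be the first member");

/* Read-only query that only knows about the inner type. */
static bool mask_test_bit(const struct mask *m, unsigned int bit)
{
	return (m->bits >> bit) & 1UL;
}

/* Mutating helper that accepts only the wrapper type. */
static void wrapped_mask_set_bit(struct wrapped_mask *w, unsigned int bit)
{
	w->m.bits |= 1UL << bit;
}

/* In C, a pointer to a struct, suitably converted, points to its first member. */
static const struct mask *as_mask(const struct wrapped_mask *w)
{
	return (const struct mask *)w;
}
```

A caller could set a bit through wrapped_mask_set_bit(&w, 3) and then query it with mask_test_bit(as_mask(&w), 3). The mutating helper takes only the wrapper, while the read-only helper works on either, which is the distinction the cpumask discussion above is after.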