Re: [PATCH v2 bpf-next 01/13] bpf: Support multiple arg regs w/ ref_obj_id for kfuncs

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Thu, 29 Dec 2022 08:50:19 -0800

On Wed, Dec 28, 2022 at 10:40 PM David Vernet <void@xxxxxxxxxxxxx> wrote:
>
> On Sat, Dec 17, 2022 at 12:24:54AM -0800, Dave Marchevsky wrote:
> > Currently, kfuncs marked KF_RELEASE indicate that they release some
> > previously-acquired arg. The verifier assumes that such a function will
> > only have one arg reg w/ ref_obj_id set, and that that arg is the one to
> > be released. Multiple kfunc arg regs have ref_obj_id set is considered
> > an invalid state.
> >
> > For helpers, RELEASE is used to tag a particular arg in the function
> > proto, not the function itself. The arg with OBJ_RELEASE type tag is the
> > arg that the helper will release. There can only be one such tagged arg.
> > When verifying arg regs, multiple helper arg regs w/ ref_obj_id set is
> > also considered an invalid state.
> >
> > Later patches in this series will result in some linked_list helpers
> > marked KF_RELEASE having a valid reason to take two ref_obj_id args.
> > Specifically, bpf_list_push_{front,back} can push a node to a list head
> > which is itself part of a list node. In such a scenario both arguments
> > to these functions would have ref_obj_id > 0, thus would fail
> > verification under current logic.
> >
> > This patch changes kfunc ref_obj_id searching logic to find the last arg
> > reg w/ ref_obj_id and consider that the reg-to-release. This should be
> > backwards-compatible with all current kfuncs as they only expect one
> > such arg reg.
>
> Can't say I'm a huge fan of this proposal :-( While I think it's really
> unfortunate that kfunc flags are not defined per-arg for this exact type
> of reason, adding more flag-specific semantics like this is IMO a step
> in the wrong direction.  It's similar to the existing __sz and __k
> argument-naming semantics that inform the verifier that the arguments
> have special meaning. All of these little additions of special-case
> handling for kfunc flags end up requiring people writing kfuncs (and
> sometimes calling them) to read through the verifier to understand
> what's going on (though I will say that it's nice that __sz and __k are
> properly documented in [0]).

Before getting to pros/cons of KF_* vs name suffix vs helper style
per-arg description...
It's important to highlight that here we're talking about
link list and rb tree kfuncs that are not like other kfuncs.
Majority of kfuncs can be added by subsystems like hid-bpf
without touching the verifier.
Here we're paving the way for graph (aka new gen data structs)
and so far not only kfuncs, but their arg types have to have
special handling inside the verifier.
There is not much yet to generalize and expose as generic KF_
flag or as a name suffix.
Therefore I think it's more appropriate to implement them
with minimal verifier changes and minimal complexity.
There is no 3rd graph algorithm on the horizon after link list
and rbtree. Instead there is a big todo list for
'multi owner graph node' and 'bpf_refcount_t'.
Those will require bigger changes in the verifier,
so I'd like to avoid premature generalization :) as analogous
to premature optimization :)