Re: [PATCH bpf-next v2 19/25] bpf: Introduce bpf_kptr_new

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Oct 19, 2022 at 10:01:21PM IST, Alexei Starovoitov wrote:
> On Tue, Oct 18, 2022 at 10:58 PM Kumar Kartikeya Dwivedi
> <memxor@xxxxxxxxx> wrote:
> >
> > On Wed, Oct 19, 2022 at 08:01:24AM IST, Alexei Starovoitov wrote:
> > > On Thu, Oct 13, 2022 at 11:52:57AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > > +void *bpf_kptr_new_impl(u64 local_type_id__k, u64 flags, void *meta__ign)
> > > > +{
> > > > +   struct btf_struct_meta *meta = meta__ign;
> > > > +   u64 size = local_type_id__k;
> > > > +   void *p;
> > > > +
> > > > +   if (unlikely(flags || !bpf_global_ma_set))
> > > > +           return NULL;
> > >
> > > Unused 'flags' looks weird in unstable api. Just drop it?
> > > And keep it as:
> > > void *bpf_kptr_new(u64 local_type_id__k, struct btf_struct_meta *meta__ign);
> > >
> > > and in bpf_experimental.h:
> > >
> > > extern void *bpf_kptr_new(__u64 local_type_id) __ksym;
> > >
> > > since __ign args are ignored during kfunc type match
> > > the bpf progs can use it without #define.
> > >
> >
> > It's ignored during check_kfunc_call, but libbpf doesn't ignore that. The
> > prototypes will not be the same. I guess I'll have to teach it do that during
> > type match, but IDK how you feel about that.
>
> libbpf does the full type match, really?
> Could you point me to the code?
>

Not full type match, but the number of arguments must be same, so it won't allow
having kfunc as:

void *bpf_kptr_new(u64 local_type_id__k, struct btf_struct_meta *meta__ign);

in the kernel and ksym declaration in the program as:

extern void *bpf_kptr_new(__u64 local_type_id) __ksym;

I get:

libbpf: extern (func ksym) 'bpf_kptr_new_impl': func_proto [25] incompatible with kernel [60043]

vlen of func_proto in kernel type is 2, for us it will be 1.

> > Otherwise unless you want people to manually pass something to the ignored
> > argument, we have to hide it behind a macro.
> >
> > I actually like the macro on top, then I don't even pass the type ID but the
> > type. But that's a personal preference, and I don't feel strongly about it.
> >
> > So in C one does malloc(sizeof(*p)), here we'll just write
> > bpf_kptr_new(typeof(*p)). YMMV.
>
> bpf_kptr_new(typeof(*p)) is cleaner.
>

So if we're having a macro anyway, the thing above can be hidden behind it.

> > > > +   p = bpf_mem_alloc(&bpf_global_ma, size);
> > > > +   if (!p)
> > > > +           return NULL;
> > > > +   if (meta)
> > > > +           bpf_obj_init(meta->off_arr, p);
> > >
> > > I'm starting to dislike all that _arr and _tab suffixes in the verifier code base.
> > > It reminds me of programming style where people tried to add types into
> > > variable names. imo dropping _arr wouldn't be just fine.
> >
> > Ack, I'll do it in v3.
> >
> > Also, I'd like to invite people to please bikeshed a bit over the naming of the
> > APIs, e.g. whether it should be bpf_kptr_drop vs bpf_kptr_delete.
>
> bpf_kptr_drop is more precise.
> delete assumes instant free which is not the case here.
>
> How about
> extern void *__bpf_obj_new(__u64 local_type_id) __ksym;
> extern void bpf_obj_drop(void *obj) __ksym;
> #define bpf_obj_new(t) \
>  (t *)__bpf_obj_new(bpf_core_type_id_local(t));
>
> kptr means 'kernel pointer'.
> Here we have program supplied object.
> It feels 'obj' is better than 'kptr' in this context.
>

I agree, I'll rename it to bpf_obj_*.

Also, that __bpf_obj_new doesn't work yet in clang [0] but I think we can rename
it to that once clang is fixed.

 [0]: https://reviews.llvm.org/D136041

> > In the BPF list API, it's named bpf_list_del but it's actually distinct from how
> > list_del in the kernel works. So it does make sense to give them a different
> > name (like pop_front/pop_back and push_front/push_back)?
> >
> > Because even bpf_list_add takes bpf_list_head, in the kernel there's no
> > distinction between node and head, so you can do list_add on a node as well, but
> > it won't be possible with the kfunc (unless we overload the head argument to
> > also work with nodes).
> >
> > Later we'll probably have to add bpf_list_node_add etc. that add before or after
> > a node to make that work.
> >
> > The main question is whether it should closely resembly the linked list API in
> > the kernel, or can it steer away considerably from that?
>
> If we do doubly linked list we should allow delete in
> the middle with
> bpf_list_del_any(head, node)
>
> and
> bpf_list_pop_front/pop_back(head)
>
> bpf_list_add(node, head) would match kernel style,
> but I think it's cleaner to have head as 1st arg.
> In that sense new pop/push/_front/_back are cleaner.
> And similar for rbtree.
>
> If we keep (node, head) and (rb_node, rb_root) order
> we should keep kernel names.

There's also tradeoffs in how various operations are done.

Right now we have bpf_list_del and bpf_list_del_tail doing pop_front and
pop_back.

To replicate the same in kernel style API, you would do:

struct foo *f = bpf_list_first_entry_or_null(head);
if (!f) {}
bpf_list_del(&f->node);

Between those two calls, you might do bpf_list_last_entry -> bpf_list_del, the
verifier is going to have a hard time proving the aliasing of the two nodes, so
it will have to invalidate all pointers peeking into the list whenever something
modifies it. You would still be able to access memory but cannot pass it to list
ops anymore.

But I think we will pay this cost eventually anyway, people will add a peek
operation and such analysis would have to be done then. So I think it's unlikely
we can avoid this, and it might be better to make things more consistent and end
up mirroring the kernel list APIs.

WDYT?



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux