Re: [PATCH RFC bpf-next v1 16/32] bpf: Introduce BPF memory object model

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 8, 2022 at 7:46 AM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote:
>
> On Thu, 8 Sept 2022 at 16:18, Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Thu, Sep 8, 2022 at 4:50 AM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote:
> > >
> > > I slept over this. I think I can get behind this idea of implicit
> > > ctor/dtor. We might have open coded construction/destruction later if
> > > we want.
> > >
> > > I am however thinking of naming these helpers:
> > > bpf_kptr_new
> > > bpf_kptr_delete
> > > to make it clear it does a little more than just allocating the type.
> > > The open coded cases can later derive their allocation from the more
> > > bare bones bpf_kptr_alloc instead in the future.
> >
> > New names make complete sense. Good idea.
> >
> > > The main reason to have open coded-ness was being able to 'manage'
> > > resources once visibility reduces to current CPU (bpf_refcount_put,
> > > single ownership after xchg, etc.). Even with RCU, we won't allow
> > > touching the BPF special fields without refcount. bpf_spin_lock is
> > > different, as it protects more than just bpf special fields.
> > >
> > > But one can still splice or kptr_xchg before passing to bpf_kptr_free
> > > to do that. bpf_kptr_free is basically cleaning up whatever is left by
> > > then, forcefully. In the future, we might even be able to do elision
> > > of implicit dtors based on the seen data flow (splicing in single
> > > ownership implies list is empty, any other op will undo that, etc.) if
> > > there are big structs with too many fields. Can also support that in
> > > open coded cases.
> >
> > Right.
> >
> > >
> > > What I want to think about more is whether we should still force
> > > calling bpf_refcount_set vs always setting it to 1.
> > >
> > > I know we don't agree about whether list_add in shared mode should
> > > take ref vs transfer ref. I'm leaning towards transfer since that will
> > > be most intuitive. It then works the same way in both cases, single
> > > ownership only transfers the sole reference you have, so you lose
> > > access, but in shared you may have more than one. If you have just one
> > > you will still lose access.
> > >
> > > It will be odd for list_add to consume it in one case and not the
> > > other. People should already be fully conscious of how they are
> > > managing the lifetime of their object.
> > >
> > > It then seems better to require users to set the initial refcount
> > > themselves. When doing the initial linking it can be very cheap.
> > > Later get/put/inc are always available.
> > >
> > > But forcing it to be called is going to be much simpler than this patch.
> >
> > I'm not convinced yet :)
> > Pls hold on implementing one way or another.
> > Let's land the single ownership case for locks, lists,
> > rbtrees, allocators. That's plenty of patches.
> > Then we can start a deeper discussion into the shared case.
> > Whether it will be different in terms of 'lose access after list_add'
> > is not critical to decide now. It can change in the future too.
> >
>
> Right, I'm not implementing it yet. There's a lot of work left to even
> finish single ownership structures, then lots of testing.
> But it's helpful to keep thinking about future use cases while working
> on the current stuff, just to make sure we're not
> digging ourselves into a design hole.
>
> We have the option to undo damage here, since this is all
> experimental, but there's still an expectation that the API is not
> broken at whim. That wouldn't be very useful for users.

imo this part is minor.
The whole lock + list_or_rbtree in a single allocation
restriction bothers me a lot more.
We will find out for sure only when wwe have a prototype
of lock + list + rbtree and let folks who requested it
actually code things.

> > The other reason to do implicit inits and ref count sets is to
>
> I am not contesting implicit construction.
> Other lists already work with zero initialization so list_head seems
> more of an exception.
> But it's done for good reasons to avoid extra NULL checks
> unnecessarily, and make the implementation of list helpers more
> efficient and simple at the same time.
>
> > avoid fighting llvm.
> > obj = bpf_kptr_new();
> > obj->var1 = 1;
> > some_func(&obj->var2);
> > In many cases the compiler is allowed to sink stores.
> > If there are two calls that "init" two different fields
> > the compiler is allowed to change the order as well
> > even if it doesn't see the body of the function and the function is
> > marked as __pure. Technically initializers as pure functions.
>
> But bpf_refcount_set won't be marked __pure, neither am I proposing to
> allow direct stores to 'set' it.
> I'm not a compiler expert by any means, but AFAIK it should not be
> doing such reordering for functions otherwise.
> What if the function inside has a memory barrier? That would
> completely screw up things.
> It's going to have external linkage, so I don't think it can assume
> anything about side effects or not. So IMO this is not a good point.

The pure attribute tells the compiler that the function doesn't
have side effects. We even use it in the kernel code base.
Sooner or later we'll start using it in bpf too.
Things like memcmp is a primary example.
I have to correct myself though. refcount_set shouldn't be
considered pure.

> Unless you're talking about some new way of inlining such helpers from
> the compiler side that doesn't exist yet.
>
> > The verifier and llvm already "fight" a lot.
> > We gotta be very careful in the verifier and not assume
> > that the code stays as written in C.
>
> So will these implicit zero stores be done when we enter != NULL
> branch, or lazily on first access (helper arg, load, store)?

Whichever way is faster and still safe.
I assumed that we'd have to zero them after successful alloc.

> This is the flip side: rewritings insns to add stores to local kptr
> can only happen after the NULL check, in the != NULL branch, at that
> point we cannot assume R1-R5 are free for use, so complicated field
> initialization will be uglier to do implicitly (e.g. if it involves
> calling functions etc.).
> There are pros and cons for both.

Are you expecting the verifier to insert zero inits
as actual insns after 'call bpf_kptr_new' insn ?
Hmm. I imagined bpf_kptr_new helper will do it.
Just a simple loop that is inverse of zero_map_value().
Not the fastest thing, but good enough to start and can be
optimized later.
The verifier can insert ST insn too in !null branch,
since that insn only needs one register and it's known
in that branch.
It's questionable that a bunch of ST insns will be faster
than a zero_map_value-like loop.



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux