Re: [PATCH bpf-next 1/2] bpf: Migrate release_on_unlock logic to non-owning ref semantics

On Tue, Jan 17, 2023 at 05:42:39PM -0500, Dave Marchevsky wrote:
> On 12/29/22 8:13 PM, Alexei Starovoitov wrote:
> > On Thu, Dec 29, 2022 at 5:07 PM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> >> @@ -11959,7 +11956,10 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn)
> >>                 dst_reg->type = PTR_TO_MAP_VALUE;
> >>                 dst_reg->off = aux->map_off;
> >>                 WARN_ON_ONCE(map->max_entries != 1);
> >> -               /* We want reg->id to be same (0) as map_value is not distinct */
> >> +               /* map->id is positive s32. Negative map->id will not collide with env->id_gen.
> >> +                * This lets us track active_lock state as single u32 instead of pair { map|btf, id }
> >> +                */
> >> +               dst_reg->id = -map->id;
> > 
> > Here is what I meant in my earlier reply to Dave's patches 1 and 2.
> > imo this is a simpler implementation of the same logic.
> > The only tricky part is above bit that is necessary
> > to use a single u32 in bpf_reg_state to track active_lock.
> > We can follow up with clean up of active_lock comparison
> > in other places of the verifier.
> > Instead of:
> >                 if (map)
> >                         cur->active_lock.ptr = map;
> >                 else
> >                         cur->active_lock.ptr = btf;
> >                 cur->active_lock.id = reg->id;
> > it will be:
> >   cur->active_lock_id = reg->id;
> > 
> > Another cleanup is needed to compare active_lock_id properly
> > in regsafe().
> > 
> > Thoughts?
> 
> Before Kumar's commit d0d78c1df9b1d ("bpf: Allow locking bpf_spin_lock global
> variables"), setting of dst_reg->id in that code path was guarded by "does
> map val have spin_lock check":
> 
>   if (btf_record_has_field(map->record, BPF_SPIN_LOCK))
>     dst_reg->id = ++env->id_gen;
> 
> Should we do that here as well? Not sure what the implications of overzealous
> setting of dst_reg->id are.

That won't work, since for spin_locks in global data that 'id' has to be stable.
Hence I went with -map->id.
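To spell out the id-space argument, here is a compile-only sketch with stand-in
types (not the actual verifier code):

#include <stdint.h>

/* Stand-in; only the id matters here. map->id is a positive s32 that is
 * allocated once per map and stays fixed for the map's lifetime.
 */
struct bpf_map_sketch { int32_t id; };

/* env->id_gen hands out 1, 2, 3, ... (always positive), while -map->id lands
 * in 0x80000001..0xffffffff when viewed as a u32. The two ranges are disjoint,
 * neither contains 0, and unlike ++env->id_gen the value is the same every
 * time the same global-data map is loaded, which is what spin_lock tracking
 * needs.
 */
static uint32_t stable_map_lock_id(const struct bpf_map_sketch *map)
{
	return (uint32_t)-map->id;
}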

> More generally: I see why doing -map->id will not overlap with env->id_gen
> in practice. For PTR_TO_BTF_ID, I'm not sure that we'll always have a nonzero
> reg->id here. check_kfunc_call has
> 
>   if (reg_may_point_to_spin_lock(&regs[BPF_REG_0]) && !regs[BPF_REG_0].id)
>     regs[BPF_REG_0].id = ++env->id_gen;

correct.

> when checking retval, but there's no such equivalent in check_helper_call,
> which instead does
> 
>   if (type_may_be_null(regs[BPF_REG_0].type))
>     regs[BPF_REG_0].id = ++env->id_gen;
> 
> and similar in a few paths.
> 
> Should we ensure that "any PTR_TO_BTF_ID reg that has a spin_lock must
> have a nonzero id"? If we don't, a helper which returns
> PTR_TO_BTF_ID w/ spin_lock that isn't MAYBE_NULL will break this whole scheme.

correct. It's not a problem in practice, because few helpers return
PTR_TO_BTF_ID and none of them point to an object containing a spin_lock.

> Some future-proofing concerns:
> 
> Kumar's commit above mentions this in the summary:
> 
>   Note that this can be extended in the future to also remember offset
>   used for locking, so that we can introduce multiple bpf_spin_lock fields
>   in the same allocation.

Yeah. I forgot about this 'offset' idea.

> When the above happens we'll need to go back to a struct - or some packed
> int - to describe "lock identity" anyways.

yeah.
I was hoping we could represent all of 'active_lock' as a single id,
but 'offset' is hard to encode into an integer.
We won't be able to add it in the top 32 bits, since the resulting 64-bit 'id'
would need to go through check_ids(). Even if we extend idmap_scratch to be 64-bit
it won't be correct: two different 32-bit IDs are fine to see and the states
might still be equivalent, but two different offsets are not.
The offset in the 'old' state and the offset in the 'cur' state have to be the same for
the states to be equivalent. So the future 'offset' would need to be compared in regsafe()
manually anyway, and since we'd have to do that there is little point folding the
active_lock.ptr comparison into a 32-bit id.
Sigh.
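
If/when 'offset' is added, the extra check would look roughly like this
(stand-in types and simplified prototypes, purely illustrative, not the
verifier's real layout):

#include <stdbool.h>
#include <stdint.h>

struct active_lock_sketch { uint32_t id; uint32_t offset; };

/* stand-in for the verifier's check_ids(); prototype simplified for the sketch */
bool check_ids(uint32_t old_id, uint32_t cur_id, void *idmap);

/* Two different ids can still make the states equivalent, because check_ids()
 * maps the old id onto the cur id. The offset has no such mapping: it must be
 * identical between 'old' and 'cur', so it is compared directly.
 */
static bool active_lock_safe(const struct active_lock_sketch *old,
			     const struct active_lock_sketch *cur, void *idmap)
{
	if (old->offset != cur->offset)
		return false;
	return check_ids(old->id, cur->id, idmap);
}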

> IIRC in the long term we wanted to stop overloading reg->id's meaning,
> with the ideal end state being reg->id used only for "null id" purposes. If so,
> this is moving us back towards the overloaded state.
> 
> Happy to take this over and respin once we're done discussing, unless you
> want to do it.

Please go ahead.
Let's scratch my -map->id idea for now.
Let's go with embedding 'struct bpf_active_lock {void *ptr; u32 id;}' into bpf_reg_state,
but please make sure that bpf_reg_state grows by 8 bytes only.
Simply adding bpf_active_lock in the btf_id part of it would grow it by 16.
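
One possible way to hit the 8-byte target, shown with simplified stand-in
structs (illustrative layouts only, not the real bpf_reg_state):

#include <stdint.h>

struct bpf_active_lock_sketch {
	void *ptr;	/* 8 bytes */
	uint32_t id;	/* 4 bytes + 4 bytes of tail padding */
};			/* 16 bytes on a 64-bit build */

/* naive: drop the whole struct into the btf/btf_id part -> +16 bytes */
struct reg_btf_part_naive {
	void *btf;
	uint32_t btf_id;
	struct bpf_active_lock_sketch lock;
};

/* packed: reuse the 4-byte hole after btf_id for the lock id and add only
 * the pointer -> +8 bytes
 */
struct reg_btf_part_packed {
	void *btf;
	uint32_t btf_id;
	uint32_t lock_id;
	void *lock_ptr;
};

_Static_assert(sizeof(struct reg_btf_part_naive) == 16 + 16, "grows by 16");
_Static_assert(sizeof(struct reg_btf_part_packed) == 16 + 8, "grows by 8");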

Note that the other idea you had, keeping a pointer to active_lock in bpf_reg_state, is
too dangerous and probably incorrect. That pointer might become dangling after we
copy_verifier_state(). All pointers in bpf_reg_state need to point to stable objects.
The only exception is 'parent', which is very tricky on its own.

Maybe we can do
struct bpf_active_lock {
   u32 map_btf_id;  /* instead of void *ptr */
   u32 id;
};
and init map_btf_id with -map->id or btf->id;
the two ranges are guaranteed not to overlap and neither is 0.
Don't know whether other pointer types will ever need to go there. Maybe it's a premature optimization.
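
A minimal init sketch following that encoding (stand-in types; btf_sketch->id
stands in for however the verifier fetches the BTF object id):

#include <stdint.h>

struct bpf_map_sketch { int32_t id; };		/* positive s32 */
struct btf_sketch { uint32_t id; };		/* positive BTF object id */
struct bpf_active_lock { uint32_t map_btf_id; uint32_t id; };

/* map ids are negated so they land in the upper half of the u32 space,
 * disjoint from (positive) BTF object ids; neither encoding can be 0
 * (see the id-space sketch further up).
 */
static void active_lock_init(struct bpf_active_lock *lock,
			     const struct bpf_map_sketch *map,
			     const struct btf_sketch *btf,
			     uint32_t reg_id)
{
	lock->map_btf_id = map ? (uint32_t)-map->id : btf->id;
	lock->id = reg_id;
}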


