Re: [PATCH bpf-next 13/17] bpf: add support for open-coded iterator loops

On Sat, Mar 04, 2023 at 03:27:46PM -0800, Andrii Nakryiko wrote:
> particular type. This will automatically work better for kfuncs as
> new/next/destroy trios will have the same `struct bpf_iter_<type> *`
> and it won't be possible to accidentally pass wrong bpf_iter_<type> to
> wrong new/next/destroy method.

Exactly.

> >
> > > +
> > > +static int mark_stack_slots_iter(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
> > > +                              enum bpf_arg_type arg_type, int insn_idx)
> > > +{
> > > +     struct bpf_func_state *state = func(env, reg);
> > > +     enum bpf_iter_type type;
> > > +     int spi, i, j, id;
> > > +
> > > +     spi = iter_get_spi(env, reg);
> > > +     if (spi < 0)
> > > +             return spi;
> > > +
> > > +     type = arg_to_iter_type(arg_type);
> > > +     if (type == BPF_ITER_TYPE_INVALID)
> > > +             return -EINVAL;
> >
> > Do we need destroy_if_dynptr_stack_slot() equivalent here?
> 
> no, because bpf_iter is always ref-counted, so we'll always have an
> explicit unmark_stack_slots_iter() call, which will reset the slot types.
> 
> destroy_if_dynptr_stack_slot() was added because a LOCAL dynptr doesn't
> require explicit destruction. I mentioned this difference
> (a simplification for the bpf_iter case) somewhere in the commit message.

I see. Makes sense.

> > >
> > >               /* regular write of data into stack destroys any spilled ptr */
> > >               state->stack[spi].spilled_ptr.type = NOT_INIT;
> > > -             /* Mark slots as STACK_MISC if they belonged to spilled ptr. */
> > > -             if (is_spilled_reg(&state->stack[spi]))
> > > +             /* Mark slots as STACK_MISC if they belonged to spilled ptr/dynptr/iter. */
> > > +             if (is_stack_slot_special(&state->stack[spi]))
> > >                       for (i = 0; i < BPF_REG_SIZE; i++)
> > >                               scrub_spilled_slot(&state->stack[spi].slot_type[i]);
> >
> > It fixes something for dynptr, right?
> 
> It's convoluted; I think it might not have a visible effect. This is
> the situation of a partial (e.g., single byte) overwrite of
> STACK_DYNPTR/STACK_ITER, and without this change we'll leave some
> slot_types as STACK_MISC, while others as STACK_DYNPTR/STACK_ITER.
> This is an unexpected state, but I think the existing code always
> checks that for STACK_DYNPTR all 8 slots are STACK_DYNPTR.
> 
> So I think it's a good cleanup, but there are no consequences for
> dynptr correctness. And for STACK_ITER I don't have to worry about
> such a mix: if any of the slot_type[i] is STACK_ITER, then it's a
> correct iterator.

agree. I was just curious.

> > > +static bool is_iter_next_insn(struct bpf_verifier_env *env, int insn_idx, int *reg_idx)
> > > +{
> > > +     struct bpf_insn *insn = &env->prog->insnsi[insn_idx];
> > > +     const struct btf_param *args;
> > > +     const struct btf_type *t;
> > > +     const struct btf *btf;
> > > +     int nargs, i;
> > > +
> > > +     if (!bpf_pseudo_kfunc_call(insn))
> > > +             return false;
> > > +     if (!is_iter_next_kfunc(insn->imm))
> > > +             return false;
> > > +
> > > +     btf = find_kfunc_desc_btf(env, insn->off);
> > > +     if (IS_ERR(btf))
> > > +             return false;
> > > +
> > > +     t = btf_type_by_id(btf, insn->imm);     /* FUNC */
> > > +     t = btf_type_by_id(btf, t->type);       /* FUNC_PROTO */
> > > +
> > > +     args = btf_params(t);
> > > +     nargs = btf_vlen(t);
> > > +     for (i = 0; i < nargs; i++) {
> > > +             if (is_kfunc_arg_iter(btf, &args[i])) {
> > > +                     *reg_idx = BPF_REG_1 + i;
> > > +                     return true;
> > > +             }
> > > +     }
> >
> > This is some future-proofing?
> > The commit log says that all iterators have to be in the form:
> > bpf_iter_<kind>_next(struct bpf_iter* it)
> > Should we check for one and only one arg here instead?
> 
> Yeah, a bit of generality. For a long time I had a hardcoded
> assumption that the first argument is struct bpf_iter *, but that felt
> unclean, so I generalized it before submission.
> 
> But I can't think of a reason why we wouldn't just dictate (and
> enforce) that `struct bpf_iter *` comes first. It makes sense, it's
> clean, and we lose nothing. This is another thing that differs between
> dynptr and iter; for dynptr such a restriction wouldn't make sense.
> 
> Where would be a good place to enforce this for iter kfuncs?

I would probably just remove is_iter_next_insn() completely, hard-code
BPF_REG_1, and add a big comment for now.

In the follow-up we can figure out how to do:
BUILD_BUG_ON(!__same_type(bpf_iter_num_next, some canonical proto for iter_next));

Like we do:
  BUILD_BUG_ON(!__same_type(ops->map_lookup_elem,
               (void *(*)(struct bpf_map *map, void *key))NULL));

> >
> > 'depth' part of bpf_reg_state will be checked for equality in regsafe(), right?
> 
> no, it is explicitly skipped (and it's actually stacksafe(), not
> regsafe()). I can add an explicit comment that we *ignore* depth.

Ahh. That's stacksafe() indeed.
Would be great to add a comment to:
+                       if (old_reg->iter.type != cur_reg->iter.type ||
+                           old_reg->iter.state != cur_reg->iter.state ||
+                           !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap))
+                               return false;

that depth is explicitly not compared.

> I was considering adding a flag to states_equal() whether to check
> depth or not (that would make iter_active_depths_differ unnecessary),
> but it doesn't feel right. Any preferences one way or the other?

probably overkill. just a comment should be enough.

> > Every time we branch out in process_iter_next_call() there is a depth++.
> > So how come we're able to converge with:
> >  +                     if (is_iter_next_insn(env, insn_idx, &iter_arg_reg_idx)) {
> >  +                             if (states_equal(env, &sl->state, cur)) {
> > That's because states_equal() is done right before doing process_iter_next_call(), right?
> 
> Yes, we check convergence before we process_iter_next_call. We do
> converge because we ignore depth, as I mentioned above.
> 
> >
> > So depth counts the number of times bpf_iter*_next() was processed.
> > In other words, it's the number of ways the body of the loop can be walked?
> 
> More or less, yes. It's the number of sequential unrolls of the loop
> body, each time with a different starting state. But all that applies
> only to the current code path. So it's not like "how many different
> loop states we could have" in total. It's the number of loop unrolls
> with the condition "assuming the current code path that led to the
> start of the iterator loop". Some other code path could lead to a
> state (before the iterator loop starts) that converges faster or
> slower, which is why I'm pointing out the distinction.

got it. I know the comments are big and extensive already, but a
little bit more won't hurt.

> >
> > > +                     }
> > > +                     /* attempt to detect infinite loop to avoid unnecessary doomed work */
> > > +                     if (states_maybe_looping(&sl->state, cur) &&
> >
> > Maybe it's cleaner to remove the above 'goto' and do '} else if (states_maybe_looping' here?
> 
> I can undo this; it felt cleaner with an explicit "skip infinite loop
> check" both for the new code and for that async_entry_cnt check above.
> But I can revert to the if/else if/else if pattern, though I find it
> harder to follow: given that all this code (plus comments) is pretty
> long, it's easy to lose track when reading.

I'm fine either way. I just remembered that you typically try to avoid
goto-s, and seeing a goto here that could easily have been an 'else'
raised my internal alarm that I could be missing something subtle in
the code.

> 
> >
> > > +                         states_equal(env, &sl->state, cur) &&
> > > +                         !iter_active_depths_differ(&sl->state, cur)) {
> > >                               verbose_linfo(env, insn_idx, "; ");
> > >                               verbose(env, "infinite loop detected at insn %d\n", insn_idx);
> > >                               return -EINVAL;


