Re: [RFC PATCH v1 05/14] bpf: Implement BPF exception frame descriptor generation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 15 Feb 2024 at 19:24, Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Thu, 2024-02-01 at 04:21 +0000, Kumar Kartikeya Dwivedi wrote:
>
> Question: are there any real-life programs adapted to use exceptions
> with cleanup feature? It would be interesting to see how robust
> one-descriptor-per-pc is in practice, and also how it affects memory
> consumption during verification.
>

I tried it on sched-ext schedulers with this series with some helper
macros and it worked well.
I can post stats on memory consumption with the next version (or make
it part of veristat, whichever is more desirable).

I think the downside of this approach is when you have different
resources in the same register/slot from two different paths which
then hit the same bpf_throw. Surely, it will not work in that case the
way things are set up right now. But there are ways to address that
(ranging from compiler passes that hoist the throw call to the
predecessor basic blocks of separate paths when we detect such a
pattern, to emitting path hints of resource types during JIT which
bpf_throw can pick up), but I didn't encounter any examples so far
where this came up, and when it does, mostly you can rewrite things a
bit differently to make it work.

One of the changes though that I plan to make in the next posting is
keeping duplicate entries for the same resource around when the first
frame descriptor is created for a pc. This increases the chances of
merging correctly in case the second path intersects with us in some
slots, but for the purposes of releasing resources, that intersection
is sufficient.

Like you could have the same resource in R8 and R9, the second path
may only have it in R9 and not R8, both can merge, if we erase R8 (or
any other duplicates) in the original frame desc.

Right now, we simply do a release_resource so whenever we encounter
R8, we release its ref_obj_id and mark everything sharing the same id
invalid, so we never encounter R9 in the verifier state. In the above
case, these frame descs would not match but they could have if we had
discovered R9 before R8.

If you think this is not sufficient, please let me know. It is
certainly tricky and I might be underestimating the difficulty of
getting this right.

> The algorithm makes sense to me, a few comments/nits below.
>
> [...]
>
> > +static int find_and_merge_frame_desc(struct bpf_verifier_env *env, struct bpf_exception_frame_desc_tab *fdtab, u64 pc, struct bpf_frame_desc_reg_entry *fd)
> > +{
> > +     struct bpf_exception_frame_desc **descs = NULL, *desc = NULL, *p;
> > +     int ret = 0;
> > +
> > +     for (int i = 0; i < fdtab->cnt; i++) {
> > +             if (pc != fdtab->desc[i]->pc)
> > +                     continue;
> > +             descs = &fdtab->desc[i];
> > +             desc = fdtab->desc[i];
> > +             break;
> > +     }
> > +
> > +     if (!desc) {
> > +             verbose(env, "frame_desc: find_and_merge: cannot find frame descriptor for pc=%llu, creating new entry\n", pc);
> > +             return -ENOENT;
> > +     }
> > +
> > +     if (fd->off < 0)
> > +             goto stack;
>
> Nit: maybe write it down as
>
>         if (fd->off >= 0)
>                 return merge_frame_desc(...);
>
>      and avoid goto?
>

Ack, will fix.

> [...]
>
> > +static int gen_exception_frame_desc_stack_entry(struct bpf_verifier_env *env, struct bpf_func_state *frame, int stack_off)
> > +{
> > +     int spi = stack_off / BPF_REG_SIZE, off = -stack_off - 1;
> > +     struct bpf_reg_state *reg, not_init_reg, null_reg;
> > +     int slot_type, ret;
> > +
> > +     __mark_reg_not_init(env, &not_init_reg);
> > +     __mark_reg_known_zero(&null_reg);
>
> __mark_reg_known_zero() does not set .type field,
> thus null_reg.type value is undefined.
>

Hmm, good catch, will fix it.

> > +
> > +     slot_type = frame->stack[spi].slot_type[BPF_REG_SIZE - 1];
> > +     reg = &frame->stack[spi].spilled_ptr;
> > +
> > +     switch (slot_type) {
> > +     case STACK_SPILL:
> > +             /* We skip all kinds of scalar registers, except NULL values, which consume a slot. */
> > +             if (is_spilled_scalar_reg(&frame->stack[spi]) && !register_is_null(&frame->stack[spi].spilled_ptr))
> > +                     break;
> > +             ret = gen_exception_frame_desc_reg_entry(env, reg, off, frame->frameno);
> > +             if (ret < 0)
> > +                     return ret;
> > +             break;
> > +     case STACK_DYNPTR:
> > +             /* Keep iterating until we find the first slot. */
> > +             if (!reg->dynptr.first_slot)
> > +                     break;
> > +             ret = gen_exception_frame_desc_dynptr_entry(env, reg, off, frame->frameno);
> > +             if (ret < 0)
> > +                     return ret;
> > +             break;
> > +     case STACK_ITER:
> > +             /* Keep iterating until we find the first slot. */
> > +             if (!reg->ref_obj_id)
> > +                     break;
> > +             ret = gen_exception_frame_desc_iter_entry(env, reg, off, frame->frameno);
> > +             if (ret < 0)
> > +                     return ret;
> > +             break;
> > +     case STACK_MISC:
> > +     case STACK_INVALID:
> > +             /* Create an invalid entry for MISC and INVALID */
> > +             ret = gen_exception_frame_desc_reg_entry(env, &not_init_reg, off, frame->frameno);
> > +             if (ret < 0)
> > +                     return 0;
>
> No tests are failing if I comment out this block.
> Looking at the merge_frame_desc() logic it appears to me that fd
> entries with fd->type == NOT_INIT would only be merged with other
> NOT_INIT entries. What is the point of having such entries at all?
>

Assume you figured it out based on the other email.
Basically, creating entries so that no merge can occur for the slot.

> > +             break;
> > +     case STACK_ZERO:
> > +             reg = &null_reg;
> > +             for (int i = BPF_REG_SIZE - 1; i >= 0; i--) {
> > +                     if (frame->stack[spi].slot_type[i] != STACK_ZERO)
> > +                             reg = &not_init_reg;
> > +             }
> > +             ret = gen_exception_frame_desc_reg_entry(env, &null_reg, off, frame->frameno);
> > +             if (ret < 0)
> > +                     return ret;
>
> Same here, no tests are failing if STACK_ZERO block is commented.
> In general, what is the point of adding STACK_ZERO entries?
> There is a logic in place to merge NULL and non-NULL entries,
> but how is it different from not adding NULL entries in a first place?
> find_and_merge_frame_desc() does a linear scan over bpf_exception_frame_desc->stack
> and does not rely on entries being sorted by .off field.
>

Answered on the other email as well. I will add more tests for this case.
STACK_ZERO also creates NULL entries, we just treat them the same as
NULL reg and insert an entry using a fake reg initialized to zero.
Also I think we need to pass reg, not &null_reg in this case. Will fix
this part of the code and add tests.

While it does not rely on entries being sorted, we still find the one
with the same offset.
I can sort for a binary search at runtime, I was mostly worried that
constant re-sorting during verification may end up costing more when
the number of frame descriptors is low, but I think it makes sense
atleast for runtime when the array is only searched.

> > +             break;
> > +     default:
> > +             verbose(env, "verifier internal error: frame%d stack off=%d slot_type=%d missing handling for exception frame generation\n",
> > +                     frame->frameno, off, slot_type);
> > +             return -EFAULT;
> > +     }
> > +     return 0;
> > +}




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux