On Thu, 14 Mar 2024 at 12:08, Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > Hi Kumar, > Hello Eduard, > Sorry for delayed response. Theoretical possibility that frame merge > might fail because of some e.g. clang version changes bugs me, > so I thought a bit on what alternatives are there. I spent some time this weekend addressing feedback, and prepping the next version. As you point out (and as I explained in one of the replies to you), the possibility of the same stack slot / register containing a conflicting resource type can cause merge problems. I have had a similar thought to what you describe below (emitting resource hints in such cases) to address this, but I slept over your idea and have a few thoughts. Please let me know what you think. > For the sake of discussion, what do you think about runtime-based > approach: > - for each throwing sub-program reserve a stack region where > references to currently acquired resources (and their types) > would be stored; > - upon a call to something that acquires resource push pointer/type > pair to this region; > - upon a call to something that releases resource delete pointer/type > pair from this region; > - when bpf_throw() is executed, walk this region for each active frame > and execute corresponding destructors. > - (alternatively, reserve one region for all frames). > I went over this option early on but rejected it due to the typical downsides you observed. The maximum stack usage due to this will be peak resource acquisition during the runtime of a subprogram (N) x 16 bytes. The maximum will be the sum of all subprogs in a call chain. This seems a bit problematic to me, given this can basically greatly exceed typical program stack usage. Compared to extra stack slot usage by bpf_loop inlining, and may_goto instruction, this appears to be too much. We can mitigate it by using a more space efficient encoding to identify resources, but the core problem still remains. Apart from that, there's the overhead to know zero of this stack portion on entry every time. I just think this is all unnecessary especially when most of the time the exception is not going to be thrown anyway, so it's all cost incurred for a case that will practically never happen in a correct program. > Technically, this could be implemented as follows: > - during main verification pass: > - when verifier processes a call to something that acquires resource, > mark call instruction and resource type in insn_aux_data data; > - same for processing of a call to something that releases resource; > - keep track of a maximal number of simultaneously acquired resources; > - after main verification pass: > - bump stack size for each subprogram by amount, necessary to hold > the acquired resource table and assume that this table is at the > end of the subprogram stack; > - after each acquire call, insert a kfunc call that would add > resource reference to the table; > - after each release call, insert a kfunc call that would remove > resource reference from the table. > > On a surface it appears that this approach has a few advantages: > - seems simpler than frame descriptors tracking and merging > (but only implementation would tell if this is so); > - should be resilient to program compilation changes; > - abort is possible at any program point, which might be interesting for: > - cancelable BPF programs (where abort might be needed in the middle > of the e.g. loop); Let us discuss cancellation in another set that I plan to send in some time after this one, but I think aborting arbitrarily at any point of the program is not the way to go. I have also reviewed literature on how this happens in other language runtimes, and the broad consensus is to have explicit cancellation points in programs, and only allow cancellation for them. Synchronous and asynchronous is irrelevant to the mechanism, but usually it is at explicit program points. Not only can you encounter a program executing in the middle of a loop, you can interrupt it within a kfunc, in such a case, you cannot really abort it immediately, as that may not be safe. It is thus better to designate cancellation points in the program, which are visited even in cases where the program may possibly be stuck for a long time (like iterator next kfunc) and only do cancellation when executing them. Anyway, let's continue this discussion once I send the RFC out for cancellation. Will try to also include a PoC for arena fault aborting. > - arenas, where it might be desirable to stop the program after e.g. > faulting arena access. > > The obvious disadvantage is incurred runtime cost. > On the other hand, it might be not that big. > > What do you think? Let's go back to the problem we can encounter with frame descriptors. The case of multiple paths having distinct resource types in a common stack slot. What I would do is only force a spill of the specific resource type (right after the pc acquires it) only if their stack slots are unmergeable. Since we go over the stack first, any conflicts in the register types will be resolved since all states for the ref_obj_id will be erased. For conflicts in registers, we'd do something similar. A new 8-byte entry would be reserved for the pc after max_stack_depth and all these entries will be zeroed upon entry. A similar thing could be done by the compiler itself and the verifier wouldn't have to do all this, but for now let's just do it in the verifier. In this way, when we encounter an unlikely case where the same slot has conflicting entries, we'll erase state by storing its entry in the stack slot reserved for the pc, and clear its state from the verifier state, allowing all respective entries for the same resource to now merge easily. The same can be repeated for each resource if all of them conflict. So far playing around with this series applying it to program cancellation and sched-ext cases, I haven't encountered merging issues, so it seems like an edge case that is unlikely to often occur in practice. But when it does, it can be resolved by the logic above. This should address all issues and should be resilient to changes triggered by different clang versions. Let me know what you think. > > Thanks, > Eduard