On Mon, Sep 30, 2024 at 9:38 PM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote: > > On Tue, 1 Oct 2024 at 06:31, Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote: > > > > On Mon, 30 Sept 2024 at 17:03, Alexei Starovoitov > > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > > > On Thu, Sep 26, 2024 at 4:45 PM Yonghong Song <yonghong.song@xxxxxxxxx> wrote: > > > > > > > > Add jit support for private stack. For a particular subtree, e.g., > > > > subtree_root <== stack depth 120 > > > > subprog1 <== stack depth 80 > > > > subprog2 <== stack depth 40 > > > > subprog3 <== stack depth 160 > > > > > > > > Let us say that private_stack_ptr is the memory address allocated for > > > > private stack. The frame pointer for each above is calculated like below: > > > > subtree_root <== subtree_root_fp = private_stack_ptr + 120 > > > > subprog1 <== subtree_subprog1_fp = subtree_root_fp + 80 > > > > subprog2 <== subtree_subprog2_fp = subtree_subprog1_fp + 40 > > > > subprog3 <== subtree_subprog1_fp = subtree_root_fp + 160 > > > > > > > > For any function call to helper/kfunc, push/pop prog frame pointer > > > > is needed in order to preserve frame pointer value. > > > > > > > > To deal with exception handling, push/pop frame pointer is also used > > > > surrounding call to subsequent subprog. For example, > > > > subtree_root > > > > subprog1 > > > > ... > > > > insn: call bpf_throw > > > > ... > > > > > > > > After jit, we will have > > > > subtree_root > > > > insn: push r9 > > > > subprog1 > > > > ... > > > > insn: push r9 > > > > insn: call bpf_throw > > > > insn: pop r9 > > > > ... > > > > insn: pop r9 > > > > > > > > exception_handler > > > > pop r9 > > > > ... > > > > where r9 represents the fp for each subprog. > > > > > > Kumar, > > > please review the interaction of priv_stack with exceptions. > > > > > > > Hm, I think it works fine (because of push_r9 around subprog calls), > > but I think it's not needed (unless I missed something). > > > > The kernel won't care about r9's value, so the only reason pop_r9 is > > being done in ex_handler is to restore the private stack value? > > In that case you can just push_r9 once for > > prog->aux->exception_boundary after emit_private_frame_ptr, since only > > this main subprog's stack will be reused to run the exception > > callback. Then keep the pop_r9 as is. And then you don't need to > > push_r9 and pop_r9 around subprog calls. > > > > But do you need to do it? It seems later on it will just be overwritten. > > Since exception_cb is a PSTACK_TREE_ROOT, you will set its r9 using mov. > > > > So it seems all of this may be unnecessary. > > > > Let me know if I misunderstood something. > > > > ... > > > > Another issue that I came across when reading through patches is how > > this interacts with extension progs. > > Say right now I have this call chain: > > > > main_subprog > > global_subprog (overriden by ext prog) > > subprog1 > > call helper > > subprog2 > > > > global_subprog (which is a separate ext prog attached there using > > freplace) is not using private stack, but main prog is. > > So when main_subprog calls into it, it is possible r9 gets overwritten > > since JIT didn't preserve it for helper calls from the ext prog. > > > > When subprog2 is called, this will mean r9 equals garbage? > > > > There was a vaguely similar decision to make for exceptions: if we > > keep unwinding beyond an extension prog frame, we might end up in the > > main subprog frame that did not push the necessary registers to > > restore kernel state from exception_cb (because it did not have > > bpf_throw during verification). So I ended up making the ext prog > > itself an exception boundary, so unwinding stops there and it would > > return a result to its caller prog which may or may not use > > exceptions. > > > > In my case, I cannot go and re-JIT old progs to make them exception > > aware (short of always pushing r12 and all callee regs). > > But in your case the changes would be contained to ext_prog, so you > > could just figure that out from the tgt_prog seen in > > bpf_check_attach_target (i.e. whether it is using priv_stack or not, > > and if so, extra JIT instrumentation to reuse and push/pop r9 around > > helper calls is necessary for this ext_prog). > > > > Oh, on second thought: > I guess the second issue is not a problem (yet) because you always push > and pop r9 around subprog calls. > But say you dropped that (if the earlier comment is correct about it > being unnecessary for exceptions), then such a case would surface. ... > push_r9 and pop_r9 around subprog calls. I don't see how you can avoid that. subprog will adjust r9, so the main prog needs to go back to its r9 value after subprog finishes. So it's either push/pop around subprogs or subprog has to r9 += sz in the prologue and r9 -= sz in the epilogue. while keeping r9 intact for the duration. It would still need to push/pop around kfuncs/helpers, since the kernel might use it. imo push/pop around every call is easier to reason about.