On Tue, 1 Oct 2024 at 21:53, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
>
> Another idea...
>

Thanks for explaining why push/pop is still necessary. I agree then it
seems it cannot be avoided.

> Currently the prologue looks like:
> push rbp
> mov rbp, rsp
> sub rsp, stack_depth
>
> how about in the main prog we keep the first two insns,
> but then set rsp with a single insn to point to the top
> of our private stack that should have enough room
> for stack_of_main_prog + stacks_of_all_subprogs + extra 8k for kfuncs/helpers.
>
> The prologue of all subprogs will stay as-is with above 3 insns.
> The epilogue is the same in main prog and subprogs: leave + ret.
>
> Such stack will look like a typical split stack used in compilers.
>
> The obvious advantage is we don't need to touch r9, do push/pop,
> and stack unwind will work just fine.
> In the past we discussed something like this, but
> then we did all 3 insns in the private stack
> and it was problematic due to IRQs.
> In this approach the main prog will use up to 512 bytes of
> kernel stack, but everything that it calls will be in the private stack,
> and since it doesn't migrate there is no per-cpu memory reuse issue.
>

I think this is much better, but I'm wondering how the hierarchical
scheduling case will occur in reality. Will it be the main prog
invoking a kfunc, that in turn invokes another prog, which can do the
same thing again?

If so, the lack of using a private stack for main prog would be a
problem, right? Because effectively if we don't call into subprogs we
don't use the private stack at all, and all invocations share the same
kernel stack, which brings us back to the current state.

Instead can we set rbp to point to the top of the private stack in the
main prog itself?

> Thoughts?
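
For reference, read literally, the main prog prologue and epilogue
described in the quoted proposal would look roughly like the sketch
below. It only restates the quoted description. priv_stack_top is a
placeholder name for wherever the private stack top would come from,
not something taken from the patches, and the "..." stands for the
prog body.

    push rbp
    mov  rbp, rsp
    mov  rsp, priv_stack_top   ; placeholder: one insn pointing rsp at the private stack top
    ...
    leave                      ; rsp = rbp (back on the kernel stack), then pop rbp
    ret

Subprogs would keep the usual three-insn prologue (push rbp; mov rbp,
rsp; sub rsp, stack_depth) and the same leave + ret epilogue.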