On Tue, Jul 23, 2024 at 8:17 PM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Mon, Jul 22, 2024 at 8:27 PM Andrii Nakryiko > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > > We *need to support recursion* is my main point. > > > > > > Not quite. > > > It's not a recursion. The stack collapsed/gone/wiped out before tail_call. > > > > Only of subprog(), not of handle_tp(). See all those "ENTRY - AFTER" > > messages. We do return to all the nested handle_tp() calls and > > continue just fine. > > > > I put the log into [0] for a bit easier visual inspection. > > > > [0] https://gist.github.com/anakryiko/6ccdfc62188f8ad4991641fb637d954c > > Argh. So the pathological prog can consume 512*33 of stack. > We have to reject it somehow in the verifier or tailor private stack > to support it. Then private stack will be a feature and a fix for this issue. > But then it would need to preallocate 512*33 per cpu per program. > Which is too much. > Maybe we can preallocate _aligned_ 512 or 1k per cpu per prog, > then adjust r9 before call or tail_call and if r9 is about to cross > alignment before tail_call fail the tail call (like tail call cnt was > over limit). This is close to what I proposed to Yonghong offline. One approach I had in mind was as follows. If we know that a BPF program can do a tail call, then allocate some larger private stack (1KB, 4KB, 8KB, don't know), compared what the BPF program itself would need. Then in bpf_tail_call() helper's inlining itself check whether R9 + <max_prog_stack_size> is larger than the private stack's size. And if yes, then don't do tail call (as if we reached max number of tail calls). Tail call interface allows for that. This way we don't slow down typical non-tail call cases and don't pay unnecessary memory price, but we still make tail call work just fine in most cases, except some pathological ones like my example. I think the expected situation for tail call is to replace main program with another main program, so the typical case will work perfectly fine. > Hopefully there are better ideas, since it's all quite messy.