On Mon, Apr 17, 2017 at 03:49:55PM -0400, David Miller wrote:
> From: Jesper Dangaard Brouer <brouer@xxxxxxxxxx>
> Date: Sun, 16 Apr 2017 22:26:01 +0200
>
> > The bpf tail-call use-case is a very good example of why the
> > verifier cannot deduct the needed HEADROOM upfront.
>
> This brings up a very interesting question for me.
>
> I notice that tail calls are implemented by JITs largely by skipping
> over the prologue of that destination program.
>
> However, many JITs preload cached SKB values into fixed registers in
> the prologue. But they only do this if the program being JITed needs
> those values.
>
> So how can it work properly if a program that does not need the SKB
> values tail calls into one that does?

For x86 JIT it's fine, since caching of skb values is not part of
the prologue:

    emit_prologue(&prog);
    if (seen_ld_abs)
        emit_load_skb_data_hlen(&prog);

and tail_call jumps into the next program as:

    EMIT4(0x48, 0x83, 0xC0, PROLOGUE_SIZE);   /* add rax, prologue_size */
    EMIT2(0xFF, 0xE0);                        /* jmp rax */

whereas inside emit_prologue() we have:

    BUILD_BUG_ON(cnt != PROLOGUE_SIZE);

so the prologue has the same size in every program, and the jump
lands right on the skb caching code of the target program, if it
has any.

arm64 has similar prologue skipping code and it's even simpler than
x86, since it doesn't try to optimize LD_ABS/IND in assembler and
instead calls into bpf_load_pointer() from generated code, so there is
no caching of skb values at all.

s390 jit has partial skipping of the prologue, since a bunch of
registers are saved/restored during tail_call, and it looks fine to me
as well.

It's very hard to extend test_bpf.ko with tail_calls, since maps need
to be allocated and populated with file descriptors, which is not
feasible to do from a .ko. Instead we need a user space based test for
it. We've started building one in
tools/testing/selftests/bpf/test_progs.c; many more tests need to be
added. Thorough testing of tail_calls is on the todo list.
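To make the x86 argument concrete, here is a small user space toy model in C. It is only a sketch and not kernel code: toy_prog, toy_layout and toy_run() are hypothetical names made up for illustration, and the "cache" is just a flag. It compares the layout described above (skb caching emitted after the fixed-size prologue) with a hypothetical layout where the caching would live inside the prologue that a tail call skips.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
#define TOY_MAX_HOPS 32 /* arbitrary bound so a cyclic chain terminates */

struct toy_prog {
    bool seen_ld_abs;                   /* program uses LD_ABS/LD_IND */
    const struct toy_prog *tail_target; /* program to tail call into, or NULL */
};

enum toy_layout {
    CACHE_AFTER_PROLOGUE, /* x86-like: caching code follows the prologue */
    CACHE_IN_PROLOGUE,    /* hypothetical: caching code is part of the prologue */
};

/*
 * Run 'entry' and follow its tail calls. Returns true if every program
 * that needs cached skb values found them loaded when it ran.
 */
static bool toy_run(const struct toy_prog *entry, enum toy_layout layout)
{
    const struct toy_prog *prog = entry;
    bool skb_cached = false;
    int hops = 0;

    /* Only the entry program ever executes a prologue. */
    if (layout == CACHE_IN_PROLOGUE && entry->seen_ld_abs)
        skb_cached = true;

    while (prog != NULL && hops++ < TOY_MAX_HOPS) {
        /* A tail call lands here, just past the target's prologue. */
        if (layout == CACHE_AFTER_PROLOGUE && prog->seen_ld_abs)
            skb_cached = true;

        /* A program that needs the cached values must find them loaded. */
        if (prog->seen_ld_abs && !skb_cached)
            return false;

        prog = prog->tail_target;
    }
    return true;
}
```

With a program that does not use LD_ABS tail calling into one that does, toy_run() succeeds for CACHE_AFTER_PROLOGUE and fails for CACHE_IN_PROLOGUE, which is the case David was asking about.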