On 2023/9/4 21:10, Ilya Leoshkevich wrote:
> On Sun, 2023-09-03 at 23:14 +0800, Leon Hwang wrote:
>> This patch series fixes a tailcall infinite loop on x64.
>>
>> From commit ebf7d1f508a73871 ("bpf, x64: rework pro/epilogue and
>> tailcall handling in JIT"), the tailcall on x64 works better than
>> before.
>>
>> From commit e411901c0b775a3a ("bpf: allow for tailcalls in BPF
>> subprograms for x64 JIT"), tailcalls are able to run in BPF
>> subprograms on x64.
>>
>> From commit 5b92a28aae4dd0f8 ("bpf: Support attaching tracing BPF
>> program to other BPF programs"), a BPF program is able to trace
>> other BPF programs.
>>
>> How about combining them all together?
>>
>> 1. FENTRY/FEXIT on a BPF subprogram.
>> 2. A tailcall runs in the BPF subprogram.
>> 3. The tailcall calls the subprogram's caller.
>>
>> As a result, a tailcall infinite loop comes up, and the loop halts
>> the machine.
>>
>> As we know, in tailcall context, the tail_call_cnt propagates by
>> stack and rax register between BPF subprograms. The same applies in
>> trampolines.
>>
>> How did I discover the bug?
>>
>> From commit 7f6e4312e15a5c37 ("bpf: Limit caller's stack depth 256
>> for subprogs with tailcalls"), the total stack size is limited to
>> around 8KiB. So I wrote some bpf progs to validate the stack
>> consumption: tailcalls running in bpf2bpf and FENTRY/FEXIT tracing
>> on bpf2bpf[1].
>>
>> At that time, I accidentally made a tailcall loop, and the loop
>> halted my VM. Without the loop, the bpf progs would consume over
>> 8KiB of stack, but the _stack-overflow_ did not halt my VM.
>>
>> With bpf_printk(), I confirmed that the tailcall count limit did
>> not work as expected. Next, I read the code and fixed it.
>>
>> Unfortunately, I only fixed it on x64, not on the other arches. As
>> a result, CI tests failed because this bug hasn't been fixed on
>> s390x.
>>
>> Some help on s390x is requested.
>
> I will take a look, thanks for letting me know.

Great.

> I noticed there was something peculiar in this area when implementing
> the trampoline:
>
>  * Note 1: The callee can increment the tail call counter, but
>  * we do not load it back, since the x86 JIT does not do this
>  * either.
>
> but I thought that this was intentional.

I do think so. But wait, should we load it back?

Let me show a demo:

struct {
	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
	__uint(max_entries, 4);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(__u32));
} jmp_table SEC(".maps");

static __noinline
int subprog_tail_01(struct __sk_buff *skb)
{
	if (load_byte(skb, 0))
		bpf_tail_call_static(skb, &jmp_table, 1);
	else
		bpf_tail_call_static(skb, &jmp_table, 0);
	return 1;
}

static __noinline
int subprog_tail_23(struct __sk_buff *skb)
{
	if (load_byte(skb, 0))
		bpf_tail_call_static(skb, &jmp_table, 3);
	else
		bpf_tail_call_static(skb, &jmp_table, 2);
	return 1;
}

int count0 = 0;

SEC("tc")
int classifier_01(struct __sk_buff *skb)
{
	count0++;
	return subprog_tail_01(skb);
}

int count1 = 0;

SEC("tc")
int classifier_23(struct __sk_buff *skb)
{
	count1++;
	return subprog_tail_23(skb);
}

static __noinline
int subprog_tailcall(struct __sk_buff *skb, int index)
{
	volatile int retval = 0;

	bpf_tail_call(skb, &jmp_table, index);
	return retval;
}

SEC("tc")
int entry(struct __sk_buff *skb)
{
	subprog_tailcall(skb, 0);
	subprog_tailcall(skb, 2);

	return 0;
}

Finally, count0 is 33, and count1 is 33, too. Running tailcalls in
subprogs this way breaks the MAX_TAIL_CALL_CNT limit.
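FWIW, the demo can be driven from userspace roughly like the sketch
below. The object file name and the test-run plumbing here are
illustrative, not the exact harness I used:

#include <bpf/bpf.h>
#include <bpf/libbpf.h>

int main(void)
{
	struct bpf_object *obj;
	struct bpf_program *prog;
	int map_fd, prog_fd, key;
	char data[64] = {};	/* first byte 0 -> the "else" tail calls run */

	/* "tailcall_demo.bpf.o" is an illustrative object file name. */
	obj = bpf_object__open_file("tailcall_demo.bpf.o", NULL);
	if (!obj || bpf_object__load(obj))
		return 1;

	map_fd = bpf_map__fd(bpf_object__find_map_by_name(obj, "jmp_table"));

	/* Slots 0/1 -> classifier_01, slots 2/3 -> classifier_23. */
	prog = bpf_object__find_program_by_name(obj, "classifier_01");
	prog_fd = bpf_program__fd(prog);
	for (key = 0; key < 2; key++)
		bpf_map_update_elem(map_fd, &key, &prog_fd, BPF_ANY);

	prog = bpf_object__find_program_by_name(obj, "classifier_23");
	prog_fd = bpf_program__fd(prog);
	for (key = 2; key < 4; key++)
		bpf_map_update_elem(map_fd, &key, &prog_fd, BPF_ANY);

	/* Run the tc "entry" prog once via BPF_PROG_TEST_RUN. */
	LIBBPF_OPTS(bpf_test_run_opts, topts,
		.data_in = data,
		.data_size_in = sizeof(data),
		.repeat = 1,
	);
	prog = bpf_object__find_program_by_name(obj, "entry");
	bpf_prog_test_run_opts(bpf_program__fd(prog), &topts);

	/* count0/count1 are plain globals; they can be read back from
	 * the object's global-data map or via a generated skeleton.
	 */
	bpf_object__close(obj);
	return 0;
}

With slots 0/1 pointing at classifier_01 and slots 2/3 at
classifier_23, a single test run of "entry" drives both tail call
chains, and both counters end up at 33.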
From commit 04fd61ab36ec065e ("bpf: allow bpf programs to tail-call
other bpf programs"):

  The chain of tail calls can form unpredictable dynamic loops therefore
  tail_call_cnt is used to limit the number of calls and currently is
  set to 32.

It seems that MAX_TAIL_CALL_CNT limits the maximum number of tailcall
hierarchy layers.

So, should we load it back?

Thanks,
Leon