Re: [PATCH bpf] bpf: Proper R0 zero-extension for BPF_CALL instructions

Björn Töpel <bjorn@xxxxxxxxxx> · Tue, 06 Dec 2022 19:38:39 +0100

Yonghong Song <yhs@xxxxxxxx> writes:

> On 12/6/22 9:47 AM, Yonghong Song wrote:
>> 
>> 
>> On 12/6/22 5:21 AM, Ilya Leoshkevich wrote:
>>> On Fri, 2022-12-02 at 11:36 +0100, Björn Töpel wrote:
>>>> From: Björn Töpel <bjorn@xxxxxxxxxxxx>
>>>>
>>>> A BPF call instruction can be, correctly, marked with zext_dst set to
>>>> true. An example of this can be found in the BPF selftests
>>>> progs/bpf_cubic.c:
>>>>
>>>>    ...
>>>>    extern __u32 tcp_reno_undo_cwnd(struct sock *sk) __ksym;
>>>>
>>>>    __u32 BPF_STRUCT_OPS(bpf_cubic_undo_cwnd, struct sock *sk)
>>>>    {
>>>>            return tcp_reno_undo_cwnd(sk);
>>>>    }
>>>>    ...
>>>>
>>>> which compiles to:
>>>>    0:  r1 = *(u64 *)(r1 + 0x0)
>>>>    1:  call -0x1
>>>>    2:  exit
>>>>
>>>> The call will be marked as zext_dst set to true, and for some backends
>>>> (bpf_jit_needs_zext() returns true) expanded to:
>>>>    0:  r1 = *(u64 *)(r1 + 0x0)
>>>>    1:  call -0x1
>>>>    2:  w0 = w0
>>>>    3:  exit
>>>
>>> In the verifier, the marking is done by check_kfunc_call() (added in
>>> e6ac2450d6de), right? So the problem occurs only for kfuncs?
>>>
>>>          /* Check return type */
>>>          t = btf_type_skip_modifiers(desc_btf, func_proto->type, NULL);
>>>
>>>          ...
>>>
>>>          if (btf_type_is_scalar(t)) {
>>>                  mark_reg_unknown(env, regs, BPF_REG_0);
>>>                  mark_btf_func_reg_size(env, BPF_REG_0, t->size);
>>>
>>> I tried to find some official information whether the eBPF calling
>>> convention requires sign- or zero- extending return values and
>>> arguments, but unfortunately [1] doesn't mention this.
>>>
>>> LLVM's lib/Target/BPF/BPFCallingConv.td mentions both R* and W*
>>> registers, but since assigning to W* leads to zero-extension, it seems
>>> to me that this is the case.
>> 
>> We actually follow the clang convention, the zero-extension is either
>> done in caller or callee, but not both. See 
>> https://reviews.llvm.org/D131598 ; how the convention could be changed.
>> 
>> The following is an example.
>> 
>> $ cat t.c
>> extern unsigned foo(void);
>> unsigned bar1(void) {
>>      return foo();
>> }
>> unsigned bar2(void) {
>>      if (foo()) return 10; else return 20;
>> }
>> $ clang -target bpf -mcpu=v3 -O2 -c t.c && llvm-objdump -d t.o
>> 
>> t.o:    file format elf64-bpf
>> 
>> Disassembly of section .text:
>> 
>> 0000000000000000 <bar1>:
>>         0:       85 10 00 00 ff ff ff ff call -0x1
>>         1:       95 00 00 00 00 00 00 00 exit
>> 
>> 0000000000000010 <bar2>:
>>         2:       85 10 00 00 ff ff ff ff call -0x1
>>         3:       bc 01 00 00 00 00 00 00 w1 = w0
>>         4:       b4 00 00 00 14 00 00 00 w0 = 0x14
>>         5:       16 01 01 00 00 00 00 00 if w1 == 0x0 goto +0x1 <LBB1_2>
>>         6:       b4 00 00 00 0a 00 00 00 w0 = 0xa
>> 
>> 0000000000000038 <LBB1_2>:
>>         7:       95 00 00 00 00 00 00 00 exit
>> $
>> 
>> If the return value of 'foo()' is actually used in the bpf program, the
>> proper zero extension will be done. Otherwise, it is not done.
>> 
>> This is with latest llvm16. I guess we need to check llvm whether
>> we could enforce to add a w0 = w0 in bar1().
>> 
>> Otherwise, with this patch, it will add w0 = w0 in all cases which
>> is not necessary in most of practical cases.
>> 
>>>
>>> If the above is correct, then shouldn't we rather use sizeof(void *) in
>>> the mark_btf_func_reg_size() call above?
>>>
>>>> The opt_subreg_zext_lo32_rnd_hi32() function which is responsible for
>>>> the zext patching, relies on insn_def_regno() to fetch the register to
>>>> zero-extend. However, this function does not handle call instructions
>>>> correctly, and opt_subreg_zext_lo32_rnd_hi32() fails the verification.
>>>>
>>>> Make sure that R0 is correctly resolved for (BPF_JMP | BPF_CALL)
>>>> instructions.
>>>>
>>>> Fixes: 83a2881903f3 ("bpf: Account for BPF_FETCH in insn_has_def32()")
>>>> Signed-off-by: Björn Töpel <bjorn@xxxxxxxxxxxx>
>>>> ---
>>>> I'm not super happy about the additional special case -- first
>>>> cmpxchg, and now call. :-( A more elegant/generic solution is
>>>> welcome!
>>>> ---
>>>>   kernel/bpf/verifier.c | 3 +++
>>>>   1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>>>> index 264b3dc714cc..4f9660eafc72 100644
>>>> --- a/kernel/bpf/verifier.c
>>>> +++ b/kernel/bpf/verifier.c
>>>> @@ -13386,6 +13386,9 @@ static int opt_subreg_zext_lo32_rnd_hi32(struct bpf_verifier_env *env,
>>>>                  if (!bpf_jit_needs_zext() && !is_cmpxchg_insn(&insn))
>>>>                          continue;
>>>> +               if (insn.code == (BPF_JMP | BPF_CALL))
>>>> +                       load_reg = BPF_REG_0;
>
> Want to double check. Do we actually have a problem here?
> For example, on x64, we probably won't have this issue.

The "problem" is that I hit this:
		if (WARN_ON(load_reg == -1)) {
			verbose(env, "verifier bug. zext_dst is set, but no reg is defined\n");
			return -EFAULT;
		}

This path is only taken for archs which have bpf_jit_needs_zext() ==
true. In my case it's riscv64, but it should hit i386, sparc, s390, ppc,
mips, and arm.
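
To make the failure mode concrete, here is a rough userspace model of
the register lookup, under the assumption that insn_def_regno() reports
"no destination register" (-1) for every BPF_JMP-class instruction,
calls included. The model_* names and struct model_insn are made up for
illustration and are not kernel code; only the opcode constants are the
real encoding values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Opcode encoding values as in uapi/linux/bpf_common.h. */
#define BPF_CLASS(code) ((code) & 0x07)
#define BPF_ALU   0x04
#define BPF_JMP   0x05
#define BPF_CALL  0x80
#define BPF_REG_0 0

/* Hypothetical, stripped-down stand-in for struct bpf_insn. */
struct model_insn {
	uint8_t code;
	uint8_t dst_reg;
};

/* Simplified model: JMP-class instructions define no register. */
static int model_def_regno(const struct model_insn *insn)
{
	switch (BPF_CLASS(insn->code)) {
	case BPF_JMP:
		return -1;
	default:
		return insn->dst_reg;
	}
}

/*
 * Register the zext pass would extend. A result of -1 for an
 * instruction marked zext_dst corresponds to the WARN_ON() path.
 * with_patch models the special case proposed in the patch.
 */
static int model_zext_reg(const struct model_insn *insn, bool with_patch)
{
	int load_reg = model_def_regno(insn);

	if (with_patch && insn->code == (BPF_JMP | BPF_CALL))
		load_reg = BPF_REG_0;
	return load_reg;
}
```

In this model a call (opcode 0x85) yields -1 without the special case
and BPF_REG_0 with it, while an ALU32 move such as "w1 = w0" (opcode
0xbc) resolves to its dst_reg either way.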

My reading of this thread has been that "marking the call as
zext_dst=true is incorrect", i.e. that LLVM will insert the correct
zext instructions.
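
For reference, the effect of such a zext instruction ("w0 = w0") on the
64-bit register can be modelled in plain C. This is only an
illustration of the semantics; model_zext32() is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the BPF instruction "w0 = w0": the low 32 bits are kept
 * and the upper 32 bits of the 64-bit register are cleared.
 */
static uint64_t model_zext32(uint64_t r0)
{
	return (uint64_t)(uint32_t)r0;
}
```

Whatever the callee left in the upper half of r0 is dropped, and a
value that already fits in 32 bits is unchanged.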

So, one way of not hitting this path is what Ilya suggests -- in
check_kfunc_call():

  if (btf_type_is_scalar(t)) {
  	mark_reg_unknown(env, regs, BPF_REG_0);
  	mark_btf_func_reg_size(env, BPF_REG_0, t->size);
  }

change t->size to sizeof(u64). Then the call won't be marked.
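
A rough model of why the size matters, assuming (from my reading of
verifier.c) that mark_btf_func_reg_size() records R0 as a sub-register
definition only when the size differs from sizeof(u64), and that only
such a definition can later cause the call to be marked zext_dst. The
model_* helpers are hypothetical; DEF_NOT_SUBREG mirrors the kernel
constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEF_NOT_SUBREG 0

/*
 * Model of the R0 bookkeeping: a full 64-bit return value is "not a
 * subreg"; anything narrower remembers the defining instruction as
 * insn_idx + 1.
 */
static uint32_t model_r0_subreg_def(size_t reg_size, uint32_t insn_idx)
{
	return reg_size == sizeof(uint64_t) ? DEF_NOT_SUBREG : insn_idx + 1;
}

/* Non-zero if the call at insn_idx could later be marked zext_dst. */
static int model_call_may_get_zext(size_t reg_size, uint32_t insn_idx)
{
	return model_r0_subreg_def(reg_size, insn_idx) != DEF_NOT_SUBREG;
}
```

With t->size == 4 (the __u32 return of tcp_reno_undo_cwnd()) the call
is a candidate for zext_dst; with sizeof(u64) it is not, so the zext
pass would never look for a destination register on the call.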

>  >>>    ...
>  >>>    extern __u32 tcp_reno_undo_cwnd(struct sock *sk) __ksym;
>  >>>
>  >>>    __u32 BPF_STRUCT_OPS(bpf_cubic_undo_cwnd, struct sock *sk)
>  >>>    {
>  >>>            return tcp_reno_undo_cwnd(sk);
>  >>>    }
>
> The native code will return a 32-bit subreg to bpf program,
> and bpf didn't do anything and return r0 to the kernel func.
> In the kernel func, the kernel will take 32-bit subreg by
> x86_64 convention. This applies to some other return types
> like u8/s8/u16/s16/u32/s32.
>
> Which architecture you actually see the issue?

This is riscv64, but the nature of the problem is more of an assertion
failure than a codegen issue, AFAIK.

I hit it when I load progs/bpf_cubic.o from the selftests. Nightly clang
from apt.llvm.org: clang version 16.0.0
(++20221204034339+7a194cfb327a-1~exp1~20221204154444.167)

Björn