Re: tp_btf: if (!struct->pointer_member) always actually false although pointer_member == NULL

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Jun 28, 2020 at 10:25 PM Wenbo Zhang <ethercflow@xxxxxxxxx> wrote:
>
> I found in tp_btf program, direct access struct's pointer member's
> behaviour isn't consistent with
> BPF_CORE_READ. for example:
>
> SEC("tp_btf/block_rq_issue")
> int BPF_PROG(tp_btf__block_rq_issue, struct request_queue *q,
>     struct request *rq)
> {
>         /* After echo none > /sys/block/$dev/queue/scheduler,
>          * the $dev's q->elevator will be set to NULL.
>          */
>         if (!q->elevator)
>                 bpf_printk("direct access: noop\n");
>         if (!BPF_CORE_READ(q, elevator))
>                 bpf_printk("FROM CORE READ: noop\n");
>         return 0;
> }
>
> Although its value is NULL, from trace_pipe I can only see
>
> > FROM CORE READ: noop
>
> So it seems  `if (!q->elevator)` always return false.
>
> I tested it with kernel 5.7.0-rc7+ and 5.8.0-rc1+, both have this problem.
> clang version: clang version 10.0.0-4ubuntu1~18.04.1
>
> Reproduce step:
> 1. Run this bpf prog;
> 2. Run `cat /sys/kernel/debug/tracing/trace_pipe` in other window;
> 3. Run `echo none > /sys/block/sdc/queue/scheduler`;  # please replace
> sdc to your device;
> 4. Run `dd if=/dev/zero of=/dev/sdc  bs=1MiB count=200 oflag=direct`;
>

Thanks a lot for detailed bug report. I haven't executed it, but I can
see from BPF assembly in kernel that there is a bug in a kernel. See
below.

>
> The output of  `llvm-objdump-10 -D bio.bpf.o` is:

Next time please use -d or -S, that way it will disassemble only
actual code, not all of the sections. Just FYI.

>
>
> bio.bpf.o:      file format ELF64-BPF
>
>
> Disassembly of section tp_btf/block_rq_issue:
>
> 0000000000000000 tp_btf__block_rq_issue:
>        0:       b7 02 00 00 08 00 00 00 r2 = 8
>        1:       79 11 00 00 00 00 00 00 r1 = *(u64 *)(r1 + 0)
>        2:       bf 16 00 00 00 00 00 00 r6 = r1
>        3:       0f 26 00 00 00 00 00 00 r6 += r2
>        4:       79 11 08 00 00 00 00 00 r1 = *(u64 *)(r1 + 8)
>        5:       55 01 0e 00 00 00 00 00 if r1 != 0 goto +14 <LBB0_2>
>        6:       b7 01 00 00 00 00 00 00 r1 = 0
>        7:       73 1a fc ff 00 00 00 00 *(u8 *)(r10 - 4) = r1
>        8:       b7 01 00 00 6f 6f 70 0a r1 = 175140719
>        9:       63 1a f8 ff 00 00 00 00 *(u32 *)(r10 - 8) = r1
>       10:       18 01 00 00 63 63 65 73 00 00 00 00 73 3a 20 6e r1 =
> 7935406810958488419 ll
>       12:       7b 1a f0 ff 00 00 00 00 *(u64 *)(r10 - 16) = r1
>       13:       18 01 00 00 64 69 72 65 00 00 00 00 63 74 20 61 r1 =
> 6998721791186332004 ll
>       15:       7b 1a e8 ff 00 00 00 00 *(u64 *)(r10 - 24) = r1
>       16:       bf a1 00 00 00 00 00 00 r1 = r10
>       17:       07 01 00 00 e8 ff ff ff r1 += -24
>       18:       b7 02 00 00 15 00 00 00 r2 = 21
>       19:       85 00 00 00 06 00 00 00 call 6
>
> 00000000000000a0 LBB0_2:
>       20:       bf a1 00 00 00 00 00 00 r1 = r10
>       21:       07 01 00 00 e8 ff ff ff r1 += -24
>       22:       b7 02 00 00 08 00 00 00 r2 = 8
>       23:       bf 63 00 00 00 00 00 00 r3 = r6
>       24:       85 00 00 00 04 00 00 00 call 4
>       25:       79 a1 e8 ff 00 00 00 00 r1 = *(u64 *)(r10 - 24)
>       26:       55 01 0e 00 00 00 00 00 if r1 != 0 goto +14 <LBB0_4>
>       27:       b7 01 00 00 0a 00 00 00 r1 = 10
>       28:       6b 1a fc ff 00 00 00 00 *(u16 *)(r10 - 4) = r1
>       29:       b7 01 00 00 6e 6f 6f 70 r1 = 1886351214
>       30:       63 1a f8 ff 00 00 00 00 *(u32 *)(r10 - 8) = r1
>       31:       18 01 00 00 45 20 52 45 00 00 00 00 41 44 3a 20 r1 =
> 2322243604989485125 ll
>       33:       7b 1a f0 ff 00 00 00 00 *(u64 *)(r10 - 16) = r1
>       34:       18 01 00 00 46 52 4f 4d 00 00 00 00 20 43 4f 52 r1 =
> 5931033040285291078 ll
>       36:       7b 1a e8 ff 00 00 00 00 *(u64 *)(r10 - 24) = r1
>       37:       bf a1 00 00 00 00 00 00 r1 = r10
>       38:       07 01 00 00 e8 ff ff ff r1 += -24
>       39:       b7 02 00 00 16 00 00 00 r2 = 22
>       40:       85 00 00 00 06 00 00 00 call 6
>
> 0000000000000148 LBB0_4:
>       41:       b7 00 00 00 00 00 00 00 r0 = 0
>       42:       95 00 00 00 00 00 00 00 exit
>

There are two relocations on instruction #0 and #4, I double-checked,
libbpf correctly resolves them to 8 (byte offset of elevator field
within request_queue). This code dump also looks correct, so Clang
generates everything properly.

But if you dump loaded BPF program assembly, it becomes clear that
verifier is not doing the right thing. Here's in-kernel version dump:

[vmuser@archvm bpftool]$ sudo bpftool p d x id 3468
int tp_btf__block_rq_issue(long long unsigned int * ctx):
; int BPF_PROG(tp_btf__block_rq_issue, struct request_queue *q,
   0: (b7) r2 = 8
; int BPF_PROG(tp_btf__block_rq_issue, struct request_queue *q,
   1: (79) r1 = *(u64 *)(r1 +0)
   2: (bf) r6 = r1
   3: (0f) r6 += r2
; if (!q->elevator)
   4: (79) r1 = *(u64 *)(r1 +8)
; bpf_printk("direct access: noop\n");

Here verifier's analysis for whatever reason concluded that r1 is
always going to be != 0 and it eliminated entire if block.

It's a bit too late here for me to dig into this, I might take a look
tomorrow, unless someone beats me to it. But yeah, there clearly is a
bug in verifier branch prediction.


   5: (bf) r1 = r10
;
   6: (07) r1 += -24
; if (!BPF_CORE_READ(q, elevator))
   7: (b7) r2 = 8
   8: (bf) r3 = r6
   9: (85) call bpf_probe_read_compat#-147792
; if (!BPF_CORE_READ(q, elevator))
  10: (79) r1 = *(u64 *)(r10 -24)
; if (!BPF_CORE_READ(q, elevator))
  11: (55) if r1 != 0x0 goto pc+14
  12: (b7) r1 = 10
; bpf_printk("FROM CORE READ: noop\n");
  13: (6b) *(u16 *)(r10 -4) = r1
  14: (b7) r1 = 1886351214
  15: (63) *(u32 *)(r10 -8) = r1
  16: (18) r1 = 0x203a444145522045
  18: (7b) *(u64 *)(r10 -16) = r1
  19: (18) r1 = 0x524f43204d4f5246
  21: (7b) *(u64 *)(r10 -24) = r1
  22: (bf) r1 = r10
  23: (07) r1 += -24
  24: (b7) r2 = 22
  25: (85) call bpf_trace_printk#-150256
; int BPF_PROG(tp_btf__block_rq_issue, struct request_queue *q,
  26: (b7) r0 = 0
  27: (95) exit


[...]

>
>
> BTW, the llvm-objdump will core dump after output the above info:
>
> Stack dump:
> 0. Program arguments: llvm-objdump-10 -D bio.bpf.o
> /usr/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm3sys15PrintStackTraceERNS_11raw_ostreamE+0x1f)[0x7f7636d5dc3f]
> /usr/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm3sys17RunSignalHandlersEv+0x50)[0x7f7636d5bf00]
> /usr/lib/x86_64-linux-gnu/libLLVM-10.so.1(+0x978205)[0x7f7636d5e205]
> /lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f76361d9890]
> /usr/lib/x86_64-linux-gnu/libLLVM-10.so.1(+0x21bbed3)[0x7f76385a1ed3]
> /usr/lib/x86_64-linux-gnu/libLLVM-10.so.1(+0x21baefb)[0x7f76385a0efb]
> /usr/lib/x86_64-linux-gnu/libLLVM-10.so.1(+0x21bc0ce)[0x7f76385a20ce]
> llvm-objdump-10[0x41b78c]
> llvm-objdump-10[0x425278]
> llvm-objdump-10[0x41f502]
> llvm-objdump-10[0x41a473]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f763546db97]
> llvm-objdump-10[0x41542a]
> [1]    21636 segmentation fault (core dumped)

This is not good, though dumping .BTF section as assembly is not a
usual case ;) Maybe Yonghong has some insights on this one, though.

>
> llvm-objdump-10 --version
> LLVM (http://llvm.org/):
>   LLVM version 10.0.0
>

[...]



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux