> On 12 Jan 2023, at 2:59 PM, Yonghong Song <yhs@xxxxxxxx> wrote: > > > > On 1/9/23 5:21 AM, Hao Sun wrote: >> Yonghong Song <yhs@xxxxxxxx> 于2022年12月18日周日 00:57写道: >>> >>> >>> >>> On 12/16/22 10:54 PM, Hao Sun wrote: >>>> >>>> >>>>> On 17 Dec 2022, at 1:07 PM, Yonghong Song <yhs@xxxxxxxx> wrote: >>>>> >>>>> >>>>> >>>>> On 12/14/22 11:49 PM, Hao Sun wrote: >>>>>> Hi, >>>>>> The following KASAN report can be triggered by loading and test >>>>>> running this simple BPF prog with a random data/ctx: >>>>>> 0: r0 = bpf_get_current_task_btf ; >>>>>> R0_w=trusted_ptr_task_struct(off=0,imm=0) >>>>>> 1: r0 = *(u32 *)(r0 +8192) ; >>>>>> R0_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) >>>>>> 2: exit >>>>>> I've simplified the C reproducer but didn't find the root cause. >>>>>> JIT was disabled, and the interpreter triggered UAF when executing >>>>>> the load insn. A slab-out-of-bound read can also be triggered: >>>>>> https://pastebin.com/raw/g9zXr8jU >>>>>> This can be reproduced on: >>>>>> HEAD commit: b148c8b9b926 selftests/bpf: Add few corner cases to test >>>>>> padding handling of btf_dump >>>>>> git tree: bpf-next >>>>>> console log: https://pastebin.com/raw/1EUi9tJe >>>>>> kernel config: https://pastebin.com/raw/rgY3AJDZ >>>>>> C reproducer: https://pastebin.com/raw/cfVGuCBm >>>>> >>>>> I I tried with your above kernel config and C reproducer and cannot reproduce the kasan issue you reported. >>>>> >>>>> [root@arch-fb-vm1 bpf-next]# ./a.out >>>>> func#0 @0 >>>>> 0: R1=ctx(off=0,imm=0) R10=fp0 >>>>> 0: (85) call bpf_get_current_task_btf#158 ; R0_w=trusted_ptr_task_struct(off=0,imm=0) >>>>> 1: (61) r0 = *(u32 *)(r0 +8192) ; R0_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) >>>>> 2: (95) exit >>>>> processed 3 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 >>>>> >>>>> prog fd: 3 >>>>> [root@arch-fb-vm1 bpf-next]# >>>>> >>>>> Your config indeed has kasan on. >>>> >>>> Hi, >>>> >>>> I can still reproduce this on a latest bpf-next build: 0e43662e61f25 >>>> (“tools/resolve_btfids: Use pkg-config to locate libelf”). >>>> The simplified C reproducer sometime need to be run twice to trigger >>>> the UAF. Also note that interpreter is required. Here is the original >>>> C reproducer that loads and runs the BPF prog continuously for your >>>> convenience: >>>> https://pastebin.com/raw/WSJuNnVU >>>> >>> >>> I still cannot reproduce with more than 10 runs. The config has jit off >>> so it already uses interpreter. It has kasan on as well. >>> # CONFIG_BPF_JIT is not set >>> >>> Since you can reproduce it, I guess it would be great if you can >>> continue to debug this. >>> >> The load insn ‘r0 = *(u32*) (current + 8192)’ is OOB, because sizeof(task_struct) >> is 7240 as shown in KASAN report. The issue is that struct task_struct is special, >> its runtime size is actually smaller than it static type size. In X86: >> task_struct->thread_struct->fpu->fpstate->union fpregs_state is >> /* >> * ... >> * The size of the structure is determined by the largest >> * member - which is the xsave area. The padding is there >> * to ensure that statically-allocated task_structs (just >> * the init_task today) have enough space. >> */ >> union fpregs_state { >> struct fregs_state fsave; >> struct fxregs_state fxsave; >> struct swregs_state soft; >> struct xregs_state xsave; >> u8 __padding[PAGE_SIZE]; >> }; >> In btf_struct_access(), the resolved size for task_struct is 10496, much bigger >> than its runtime size, so the prog in reproducer passed the verifier and leads >> to the oob. This can happen to all similar types, whose runtime size is smaller >> than its static size. >> Not sure how many similar cases are there, maybe special check to task_struct >> is enough. Any hint on how this should be addressed? > > This should a corner case, I am not aware of other allocations like this. > > For a normal program, if the access chain looks > like > task_struct->thread_struct->fpu->fpstate->fpregs_state->{fsave,fxsave, soft, xsave}, > we should not hit this issue. So I think we don't need to address this > issue in kernel. The test itself should filter this out. Maybe I didn’t make my point clear. The issue here is that the runtime size of task_struct is `arch_task_struct_size`, which equals to the following, see fpu__init_task_struct_size(): sizeof(task_struct) - sizeof(fpregs_state) + fpu_kernel_cfg.default_size However, the verifier validates task_struct access based on the ty info in btf_vmlinux, which equals to sizeof(task_struct), much bigger than `arch_ task_struct_size` due to the `__padding[PAGE_SIZE]` in fpregs_state, this leads to OOB access. We should not allow progs that access the task_struct beyond `arch_task_ struct_size` to be loaded. So maybe we should add a fixup in `btf_parse_ vmlinux()` to assign the correct size to each type of this chain: task_struct->thread_struct->fpu->fpstate->fpregs_state; Or maybe we should fix this during generating BTF of vmlinux?