On Mon, Jan 25, 2021 at 6:22 AM Yonghong Song <yhs@xxxxxx> wrote: > > > > On 1/24/21 11:06 AM, Yonghong Song wrote: > > > > > > On 1/22/21 4:44 PM, KP Singh wrote: > >> The script runs the BPF selftests locally on the same kernel image > >> as they would run post submit in the BPF continuous integration > >> framework. > >> > >> The goal of the script is to allow contributors to run selftests locally > >> in the same environment to check if their changes would end up breaking > >> the BPF CI and reduce the back-and-forth between the maintainers and the > >> developers. > >> > >> Signed-off-by: KP Singh <kpsingh@xxxxxxxxxx> > > > > Thanks! I tried the script, and it works great. > > > > Tested-by: Yonghong Song <yhs@xxxxxx> > > > > When I tried to apply the patch locally, I see the following warnings: > > -bash-4.4$ git apply ~/p1.txt > > /home/yhs/p1.txt:306: space before tab in indent. > > : ) > > /home/yhs/p1.txt:307: space before tab in indent. > > echo "Invalid Option: -$OPTARG requires an > > argument" > > warning: 2 lines add whitespace errors. > > > > Maybe you want to fix them. > > > > One issue I found with the following script, > > KBUILD_OUTPUT=/home/yhs/work/linux-bld/ > > tools/testing/selftests/bpf/run_in_vm.sh -- cat /sys/fs/bpf/progs.debug > > I see the following warning: > > > > [ 1.081000] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: > > 101, name: cat > > [ 1.081684] 3 locks held by cat/101: > > [ 1.082032] #0: ffff8880047770a0 (&p->lock){+.+.}-{3:3}, at: > > bpf_seq_read+0x3a/0x3d0 > > [ 1.082734] #1: ffffffff82d69800 (rcu_read_lock){....}-{1:2}, at: > > bpf_iter_run_prog+0x5/0x160 > > [ 1.083521] #2: ffff88800618c148 (&mm->mmap_lock#2){++++}-{3:3}, at: > > exc_page_fault+0x1a1/0x640 > > [ 1.084344] Preemption disabled at: > > [ 1.084346] [<ffffffff8108f913>] migrate_disable+0x33/0x80 > > [ 1.085207] CPU: 2 PID: 101 Comm: cat Not tainted > > 5.11.0-rc4-00524-g6e66fbb10597-dirty #1257 > > [ 1.085933] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), > > BIOS 1.9.3-1.el7.centos 04/01 > > /2014 > > [ 1.086747] Call Trace: > > [ 1.086961] dump_stack+0x77/0x97 > > [ 1.087294] ___might_sleep.cold.119+0xf2/0x106 > > [ 1.087702] exc_page_fault+0x1c1/0x640 > > [ 1.088056] asm_exc_page_fault+0x1e/0x30 > > [ 1.088413] RIP: 0010:bpf_prog_0a182df2d34af188_dump_bpf_prog+0xf5/0xbc8 > > [ 1.089009] Code: 00 00 8b 7d f4 41 8b 76 44 48 39 f7 73 06 48 01 fb > > 49 89 df 4c 89 7d d8 49 8b > > bd 20 01 00 00 48 89 7d e0 49 8b bd e0 00 00 00 <48> 8b 7f 20 48 01 d7 > > 48 89 7d e8 48 89 e9 48 83 c > > 1 d0 48 8b 7d c8 > > [ 1.090635] RSP: 0018:ffffc90000197dc8 EFLAGS: 00010282 > > [ 1.091100] RAX: 0000000000000000 RBX: ffff888005a60458 RCX: > > 0000000000000024 > > [ 1.091727] RDX: 00000000000002f0 RSI: 0000000000000509 RDI: > > 0000000000000000 > > [ 1.092384] RBP: ffffc90000197e20 R08: 0000000000000001 R09: > > 0000000000000000 > > [ 1.093014] R10: 0000000000000002 R11: 0000000000000000 R12: > > 0000000000020000 > > [ 1.093660] R13: ffff888006199800 R14: ffff88800474c480 R15: > > ffff888005a60458 > > [ 1.094314] ? bpf_prog_0a182df2d34af188_dump_bpf_prog+0xc8/0xbc8 > > [ 1.094871] bpf_iter_run_prog+0x75/0x160 > > [ 1.095231] __bpf_prog_seq_show+0x39/0x40 > > [ 1.095602] bpf_seq_read+0xf6/0x3d0 > > [ 1.095915] vfs_read+0xa3/0x1b0 > > [ 1.096226] ksys_read+0x4f/0xc0 > > [ 1.096527] do_syscall_64+0x2d/0x40 > > [ 1.096831] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > [ 1.097287] RIP: 0033:0x7f13a43e3ec2 > > [ 1.097625] Code: c0 e9 b2 fe ff ff 50 48 8d 3d aa 36 0a 00 e8 65 eb > > 01 00 0f 1f 44 00 00 f3 0f > > 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 > > 56 c3 0f 1f 44 00 00 48 83 e > > c 28 48 89 54 24 > > [ 1.099232] RSP: 002b:00007fffed256bb8 EFLAGS: 00000246 ORIG_RAX: > > 0000000000000000 > > [ 1.099922] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: > > 00007f13a43e3ec2 > > [ 1.100576] RDX: 0000000000020000 RSI: 00007f13a42d0000 RDI: > > 0000000000000003 > > [ 1.101197] RBP: 00007f13a42d0000 R08: 00007f13a42cf010 R09: > > 0000000000000000 > > [ 1.101868] R10: 0000000000000022 R11: 0000000000000246 R12: > > 0000561599794c00 > > [ 1.102486] R13: 0000000000000003 R14: 0000000000020000 R15: > > 0000000000020000 > > > > Note that above `cat` is called during /sbin/init init process. > > ...... > > [ 0.964879] Run /sbin/init as init process > > starting pid 84, tty '': '/etc/init.d/rcS' > > ...... > > > > I checked the assembly code and the above error info and the reason > > is due to an exception (address 0) happens in bpf_prog iterator. > > > > SEC("iter/bpf_prog") > > int dump_bpf_prog(struct bpf_iter__bpf_prog *ctx) > > { > > struct seq_file *seq = ctx->meta->seq; > > __u64 seq_num = ctx->meta->seq_num; > > struct bpf_prog *prog = ctx->prog; > > struct bpf_prog_aux *aux; > > > > if (!prog) > > return 0; > > > > aux = prog->aux; > > if (seq_num == 0) > > BPF_SEQ_PRINTF(seq, " id name attached\n"); > > > > BPF_SEQ_PRINTF(seq, "%4u %-16s %s %s\n", aux->id, > > get_name(aux->btf, aux->func_info[0].type_id, > > aux->name), > > aux->attach_func_name, aux->dst_prog->aux->name); > > return 0; > > } > > > > In the above, aux->dst_prog == 0 and exception does not get caught > > properly and kernel complains. This might be due to > > ths `cat /sys/fs/bpf/progs.debug` is called too early (in init process) > > and something is not set up properly yet. > > > > In a different rootfs, I called `cat /sys/fs/bpf/progs.debug` after > > login prompt, and I did not see the error. > > > > If somebody knows what is the possible reason, that will be great. > > Otherwise, I will continue to debug this later. > > I did some investigation and found the root cause. > > In arch/x86/mm/fault.c, function do_user_addr_fault(), > > The following if condition is false when /sys/fs/bpf/progs.debug is > run during init time and is true when it is run after login prompt. > > if (unlikely(cpu_feature_enabled(X86_FEATURE_SMAP) && > !(hw_error_code & X86_PF_USER) && > !(regs->flags & X86_EFLAGS_AC))) > { > bad_area_nosemaphore(regs, hw_error_code, address); > return; > } > > Specifically, cpu_feature_enabled(X86_FEATURE_SMAP) is false when bpf > program is run at /sbin/init time and is true after login prompt. > > The false condition eventually leads the control to the following > code in do_user_addr_fault(). > > if (unlikely(!mmap_read_trylock(mm))) { > if (!user_mode(regs) && > !search_exception_tables(regs->ip)) { > /* > * Fault from code in kernel from > * which we do not expect faults. > */ > bad_area_nosemaphore(regs, hw_error_code, address); > return; > } > retry: > mmap_read_lock(mm); > } else { > /* > * The above down_read_trylock() might have succeeded in > * which case we'll have missed the might_sleep() from > * down_read(): > */ > might_sleep(); > } > > and since mmap_read_trylock(mm) is successful with return value 1, > might_sleep() is called and hence the warning. Do you think this needs to be worked around in the script? If so, I would prefer to do it in a separate patch so that we capture all the details. - KP > > > [...] > >> +} > >> + > > [...]