On Fri, Nov 8, 2024 at 6:53 PM Yonghong Song <yonghong.song@xxxxxxxxx> wrote:
>
>         stack_depth = bpf_prog->aux->stack_depth;
> +       if (bpf_prog->aux->priv_stack_ptr) {
> +               priv_frame_ptr = bpf_prog->aux->priv_stack_ptr + round_up(stack_depth, 16);
> +               stack_depth = 0;
> +       }
...
> +       priv_stack_ptr = prog->aux->priv_stack_ptr;
> +       if (!priv_stack_ptr && prog->aux->jits_use_priv_stack) {
> +               priv_stack_ptr = __alloc_percpu_gfp(prog->aux->stack_depth, 16, GFP_KERNEL);

After applying I started to see crashes running test_progs -j like:

[  173.465191] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000af9: 0000 [#1] PREEMPT SMP KASAN
[  173.466053] KASAN: probably user-memory-access in range [0x00000000000057c8-0x00000000000057cf]
[  173.466053] RIP: 0010:dst_dev_put+0x1e/0x220
[  173.466053] Call Trace:
[  173.466053]  <IRQ>
[  173.466053]  ? die_addr+0x40/0xa0
[  173.466053]  ? exc_general_protection+0x138/0x1f0
[  173.466053]  ? asm_exc_general_protection+0x26/0x30
[  173.466053]  ? dst_dev_put+0x1e/0x220
[  173.466053]  rt_fibinfo_free_cpus.part.0+0x8c/0x130
[  173.466053]  fib_nh_common_release+0xd6/0x2a0
[  173.466053]  free_fib_info_rcu+0x129/0x360
[  173.466053]  ? rcu_core+0xa55/0x1340
[  173.466053]  rcu_core+0xa55/0x1340
[  173.466053]  ? rcutree_report_cpu_dead+0x380/0x380
[  173.466053]  ? hrtimer_interrupt+0x319/0x7c0
[  173.466053]  handle_softirqs+0x14c/0x4d0

[   35.134115] Oops: general protection fault, probably for non-canonical address 0xe0000bfff101fbbc: 0000 [#1] PREEMPT SMP KASAN
[   35.135089] KASAN: probably user-memory-access in range [0x00007fff880fdde0-0x00007fff880fdde7]
[   35.135089] RIP: 0010:destroy_workqueue+0x4b4/0xa70
[   35.135089] Call Trace:
[   35.135089]  <TASK>
[   35.135089]  ? die_addr+0x40/0xa0
[   35.135089]  ? exc_general_protection+0x138/0x1f0
[   35.135089]  ? asm_exc_general_protection+0x26/0x30
[   35.135089]  ? destroy_workqueue+0x4b4/0xa70
[   35.135089]  ? destroy_workqueue+0x592/0xa70
[   35.135089]  ? __mutex_unlock_slowpath.isra.0+0x270/0x270
[   35.135089]  ext4_put_super+0xff/0xd70
[   35.135089]  generic_shutdown_super+0x148/0x4c0
[   35.135089]  kill_block_super+0x3b/0x90
[   35.135089]  ext4_kill_sb+0x65/0x90

I think I see the bug... quoted it above...

Please make sure you reproduce it first.
Then let's figure out a way to test for such things and what we can do to
make KASAN detect it sooner, since the above crashes have no indication at
all that the bpf priv stack is responsible.

If there is another bug in the priv stack that causes future crashes, we
need to make sure that priv stack corruption is detected by KASAN (or
whatever mechanism) earlier. We cannot land private stack support while
there is a possibility of such silent corruption.

pw-bot: cr