On Wed, 5 Aug 2020 at 01:20, Martin KaFai Lau <kafai@xxxxxx> wrote: > > On Tue, Aug 04, 2020 at 05:45:43PM +0100, Lorenz Bauer wrote: > > On Tue, 4 Aug 2020 at 17:05, Martin KaFai Lau <kafai@xxxxxx> wrote: > > > > > > On Tue, Aug 04, 2020 at 12:21:01PM +0100, Lorenz Bauer wrote: > > > > Hi list, > > > > > > > > I just got this warning while running test progs on commit > > > > 21594c44083c375697d418729c4b2e4522cf9f70 in the #4/22 > > > > bpf_sk_storage_map test. > > > > > > > > [ 38.775254] ============================= > > > > [ 38.775692] [ BUG: Invalid wait context ] > > > > [ 38.776234] 5.8.0-rc6+ #35 Not tainted > > > > [ 38.776699] ----------------------------- > > > > [ 38.777141] test_progs/254 is trying to lock: > > > > [ 38.777650] ffff8881197f4450 (&krcp->lock){....}-{3:3}, at: > > > > kfree_call_rcu+0x1a6/0x5b0 > > > > [ 38.778589] other info that might help us debug this: > > > > [ 38.779155] context-{5:5} > > > > [ 38.779476] 3 locks held by test_progs/254: > > > > [ 38.779944] #0: ffff88810dc3f020 > > > > (&sb->s_type->i_mutex_key#6){+.+.}-{4:4}, at: > > > > __sock_release+0x76/0x280 > > > > [ 38.780957] #1: ffffffff83d66840 (rcu_read_lock){....}-{1:3}, at: > > > > bpf_sk_storage_free+0x0/0x2a0 > > > > [ 38.781878] #2: ffff88810b6940b8 (&sk_storage->lock){+...}-{2:2}, > > > > at: bpf_sk_storage_free+0xa3/0x2a0 > > > > [ 38.782812] stack backtrace: > > > > [ 38.783115] CPU: 1 PID: 254 Comm: test_progs Not tainted 5.8.0-rc6+ #35 > > > > [ 38.783789] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), > > > > BIOS 1.13.0-1ubuntu1 04/01/2014 > > > > [ 38.784711] Call Trace: > > > > [ 38.784976] dump_stack+0x9e/0xe0 > > > > [ 38.785429] __lock_acquire.cold+0x3a6/0x46c > > > > [ 38.786025] ? register_lock_class+0x17a0/0x17a0 > > > > [ 38.786588] lock_acquire+0x1be/0x7e0 > > > > [ 38.787010] ? kfree_call_rcu+0x1a6/0x5b0 > > > > [ 38.787421] ? check_flags+0x60/0x60 > > > > [ 38.787790] ? mark_lock+0x12c/0x1470 > > > > [ 38.788179] ? check_chain_key+0x215/0x5a0 > > > > [ 38.788613] ? print_usage_bug+0x1f0/0x1f0 > > > > [ 38.789036] _raw_spin_lock+0x2c/0x70 > > > > [ 38.789415] ? kfree_call_rcu+0x1a6/0x5b0 > > > > [ 38.789829] kfree_call_rcu+0x1a6/0x5b0 > > > > [ 38.790239] __selem_unlink_sk+0x1eb/0x520 > > > > [ 38.790691] bpf_sk_storage_free+0x118/0x2a0 > > > > [ 38.791138] __sk_destruct+0xd3/0x4d0 > > > > [ 38.791525] inet_release+0xf4/0x220 > > > > [ 38.791939] __sock_release+0xc0/0x280 > > > > [ 38.792328] sock_close+0xf/0x20 > > > > [ 38.792676] __fput+0x29b/0x7b0 > > > > [ 38.793004] task_work_run+0xcc/0x170 > > > > [ 38.793429] __prepare_exit_to_usermode+0x1c6/0x1d0 > > > > [ 38.793948] do_syscall_64+0x62/0xa0 > > > > [ 38.794407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > > > [ 38.795004] RIP: 0033:0x7f73410e33d7 > > > > [ 38.795376] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 > > > > 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 > > > > 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 f3 fb > > > > ff ff > > > > [ 38.797266] RSP: 002b:00007ffc86615fe8 EFLAGS: 00000246 ORIG_RAX: > > > > 0000000000000003 > > > > [ 38.798030] RAX: 0000000000000000 RBX: 00007ffc86616040 RCX: 00007f73410e33d7 > > > > [ 38.798755] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000008 > > > > [ 38.799488] RBP: 000055fa708beb60 R08: 0000000000000007 R09: 000000000000002c > > > > [ 38.800241] R10: 000055fa6abd508d R11: 0000000000000246 R12: 000000000000000d > > > > [ 38.801007] R13: 00007ffc86616034 R14: 000000000000000c R15: 000055fa708bcb30 > > > > > > > > Not sure if this is useful or not, a quick search in the mailing list > > > > didn't bring anything up. > > > I cannot reproduce after running test_progs -t bpf_iter/bpf_sk_storage_map in > > > a loop. Can you share you config? > > > > Here you go: https://gist.github.com/lmb/0f19df936b296df34e4cb90dec6c3032 > I can reproduce now. The splat is from the "CONFIG_PROVE_RAW_LOCK_NESTING=y" > in the posted config above: > > config PROVE_RAW_LOCK_NESTING > bool "Enable raw_spinlock - spinlock nesting checks" > depends on PROVE_LOCKING > default n > help > Enable the raw_spinlock vs. spinlock nesting checks which ensure > that the lock nesting rules for PREEMPT_RT enabled kernels are > not violated. > > NOTE: There are known nesting problems. So if you enable this > option expect lockdep splats until these problems have been fully > addressed which is work in progress. This config switch allows to > identify and analyze these problems. It will be removed and the > check permanentely enabled once the main issues have been fixed. > > If unsure, select N. > > There are known issues in kfree_rcu() in PREEMPT_RT. My understanding is > some of them are being worked on, e.g. > https://lore.kernel.org/lkml/20200624201226.21197-2-paulmck@xxxxxxxxxx/ > https://lore.kernel.org/lkml/20200803163029.1997-1-urezki@xxxxxxxxx/ Thanks for taking a look, I'll disable this for now then! -- Lorenz Bauer | Systems Engineer 6th Floor, County Hall/The Riverside Building, SE1 7PB, UK www.cloudflare.com