On 2020-02-27 07:18, Dmitry Vyukov wrote:
> On Wed, Oct 16, 2019 at 11:58 AM Gao Xiang <gaoxiang25@xxxxxxxxxx> wrote:
>>
>> Hi,
>>
>> On Wed, Oct 16, 2019 at 02:27:07AM -0700, syzbot wrote:
>>> syzbot has found a reproducer for the following crash on:
>>>
>>> HEAD commit: 0e9d28bc Add linux-next specific files for 20191015
>>> git tree: linux-next
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=11745608e00000
>>> kernel config: https://syzkaller.appspot.com/x/.config?x=3d84ca04228b0bf4
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=36baa6c2180e959e19b1
>>> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=159d297f600000
>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16289b30e00000
>>>
>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>> Reported-by: syzbot+36baa6c2180e959e19b1@xxxxxxxxxxxxxxxxxxxxxxxxx
>>>
>>> =====================================
>>> WARNING: bad unlock balance detected!
>>> 5.4.0-rc3-next-20191015 #0 Not tainted
>>> -------------------------------------
>>> syz-executor276/8897 is trying to release lock (rcu_callback) at:
>>> [<ffffffff8160e7a4>] __write_once_size include/linux/compiler.h:226 [inline]
>>> [<ffffffff8160e7a4>] __rcu_reclaim kernel/rcu/rcu.h:221 [inline]
>>> [<ffffffff8160e7a4>] rcu_do_batch kernel/rcu/tree.c:2157 [inline]
>>> [<ffffffff8160e7a4>] rcu_core+0x574/0x1560 kernel/rcu/tree.c:2377
>>> but there are no more locks to release!
>>>
>>> other info that might help us debug this:
>>> 1 lock held by syz-executor276/8897:
>>> #0: ffff88809a3cc0d8 (&type->s_umount_key#40/1){+.+.}, at:
>>> alloc_super+0x158/0x910 fs/super.c:229
>>>
>>> stack backtrace:
>>> CPU: 0 PID: 8897 Comm: syz-executor276 Not tainted 5.4.0-rc3-next-20191015
>>> #0
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>>> Google 01/01/2011
>>> Call Trace:
>>> <IRQ>
>>> __dump_stack lib/dump_stack.c:77 [inline]
>>> dump_stack+0x172/0x1f0 lib/dump_stack.c:113
>>> print_unlock_imbalance_bug kernel/locking/lockdep.c:4008 [inline]
>>> print_unlock_imbalance_bug.cold+0x114/0x123 kernel/locking/lockdep.c:3984
>>> __lock_release kernel/locking/lockdep.c:4244 [inline]
>>> lock_release+0x5f2/0x960 kernel/locking/lockdep.c:4505
>>> rcu_lock_release include/linux/rcupdate.h:213 [inline]
>>> __rcu_reclaim kernel/rcu/rcu.h:223 [inline]
>>
>> I have little knowledge about this kind of stuff, but after seeing
>> the dashboard https://syzkaller.appspot.com/bug?extid=36baa6c2180e959e19b1
>>
>> I guess this is highly related with ntfs, and in ntfs_fill_super, it
>> has lockdep_off() in ntfs_fill_super...
>>
>> In detail, commit 90c1cba2b3b3 ("locking/lockdep: Zap lock classes even
>> with lock debugging disabled") [1], and free_zapped_rcu....
>>
>> static void free_zapped_rcu(struct rcu_head *ch)
>> {
>>         struct pending_free *pf;
>>         unsigned long flags;
>>
>>         if (WARN_ON_ONCE(ch != &delayed_free.rcu_head))
>>                 return;
>>
>>         raw_local_irq_save(flags);
>>         arch_spin_lock(&lockdep_lock);
>>         current->lockdep_recursion = 1; <--- here
>>
>>         /* closed head */
>>         pf = delayed_free.pf + (delayed_free.index ^ 1);
>>         __free_zapped_classes(pf);
>>         delayed_free.scheduled = false;
>>
>>         /*
>>          * If there's anything on the open list, close and start a new callback.
>>          */
>>         call_rcu_zapped(delayed_free.pf + delayed_free.index);
>>
>>         current->lockdep_recursion = 0;
>>         arch_spin_unlock(&lockdep_lock);
>>         raw_local_irq_restore(flags);
>> }
>>
>> Completely guess and untest since I am not familar with that,
>> but in case of that, Cc related people...
>> If I'm wrong, ignore my comments and unintentional noise....
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=90c1cba2b3b3851c151229f61801919b2904d437
>>
>> Thanks,
>> Gao Xiang
>
>
> Still happens a lot for the past 10 months:
> https://syzkaller.appspot.com/bug?id=0d5bdaf028e4283ad7404609d17e5077f48ff26d

Unless one of the NTFS maintainers steps in, should NTFS perhaps be
excluded from testing with lockdep enabled? This is what I found in the
git log of NTFS:

commit 59345374742ee6673c2d04b0fa8c888e881b7209
Author: Ingo Molnar <mingo@xxxxxxx>
Date:   Mon Jul 3 00:25:18 2006 -0700

    [PATCH] lockdep: annotate NTFS locking rules

    NTFS uses lots of type-opaque objects which acquire their true identity
    runtime - so the lock validator needs to be helped in a couple of places to
    figure out object types.

    Many thanks to Anton Altaparmakov for giving lots of explanations about
    NTFS locking rules.

    Has no effect on non-lockdep kernels.

Thanks,

Bart.