> On Apr 12, 2023, at 3:05 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote: > > On Wed, 2023-04-12 at 18:03 +0000, Chuck Lever III wrote: >> >> >>> On Apr 10, 2023, at 8:36 PM, kernel test robot <yujie.liu@xxxxxxxxx> >>> wrote: >>> >>> Hello, >>> >>> kernel test robot noticed "WARNING:inconsistent_lock_state" on: >>> >>> commit: 5fd403eb6c181c63a3aacd55d92b80256a0670cf ("shmem: stable >>> directory cookies") >>> git://git.kernel.org/cgit/linux/kernel/git/cel/linux topic-shmem- >>> stable-dir-cookies >>> >>> in testcase: boot >>> >>> compiler: gcc-11 >>> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 >>> -m 16G >>> >>> (please refer to attached dmesg/kmsg for entire log/backtrace) >>> >>> >>> If you fix the issue, kindly add following tag >>> | Reported-by: kernel test robot <yujie.liu@xxxxxxxxx> >>> | Link: >>> https://lore.kernel.org/oe-lkp/202304101606.79aea62f-yujie.liu@xxxxxxxxx >>> >>> >>> [ 21.279213][ C0] WARNING: inconsistent lock state >>> [ 21.279668][ C0] 6.3.0-rc5-00001-g5fd403eb6c18 #1 Not tainted >>> [ 21.280199][ C0] -------------------------------- >>> [ 21.280657][ C0] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} >>> usage. >>> [ 21.281238][ C0] swapper/0/0 [HC0[0]:SC1[1]:HE0:SE0] takes: >>> [ 21.281773][ C0] ffff8881102e9b50 (&xa->xa_lock#3){+.?.}-{2:2}, at: >>> xa_destroy (lib/xarray.c:2214) >>> [ 21.283140][ C0] {SOFTIRQ-ON-W} state was registered at: >>> [ 21.283640][ C0] __lock_acquire (kernel/locking/lockdep.c:5010) >>> [ 21.284089][ C0] lock_acquire (kernel/locking/lockdep.c:467 >>> kernel/locking/lockdep.c:5671 kernel/locking/lockdep.c:5634) >>> [ 21.284513][ C0] _raw_spin_lock >>> (include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154) >>> [ 21.284937][ C0] shmem_doff_add (include/linux/xarray.h:965 >>> mm/shmem.c:2943) >>> [ 21.285375][ C0] shmem_mknod (mm/shmem.c:3014) >>> [ 21.285791][ C0] vfs_mknod (fs/namei.c:3916) >>> [ 21.286195][ C0] devtmpfs_work_loop (drivers/base/devtmpfs.c:228 >>> drivers/base/devtmpfs.c:393 drivers/base/devtmpfs.c:408) >>> [ 21.286653][ C0] devtmpfsd (devtmpfs.c:?) >>> [ 21.287046][ C0] kthread (kernel/kthread.c:376) >>> [ 21.287441][ C0] ret_from_fork (arch/x86/entry/entry_64.S:314) >>> [ 21.287864][ C0] irq event stamp: 167451 >>> [ 21.288264][ C0] hardirqs last enabled at (167450): >>> kasan_quarantine_put (arch/x86/include/asm/irqflags.h:42 >>> (discriminator 1) arch/x86/include/asm/irqflags.h:77 (discriminator >>> 1) arch/x86/include/asm/irqflags.h:135 (discriminator 1) >>> mm/kasan/quarantine.c:242 (discriminator 1)) >>> [ 21.289095][ C0] hardirqs last disabled at (167451): >>> _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:108 >>> kernel/locking/spinlock.c:162) >>> [ 21.289969][ C0] softirqs last enabled at (167330): __do_softirq >>> (kernel/softirq.c:415 kernel/softirq.c:600) >>> [ 21.290755][ C0] softirqs last disabled at (167355): irq_exit_rcu >>> (kernel/softirq.c:445 kernel/softirq.c:650 kernel/softirq.c:640 >>> kernel/softirq.c:662) >>> [ 21.291540][ C0] >>> [ 21.291540][ C0] other info that might help us debug this: >>> [ 21.292230][ C0] Possible unsafe locking scenario: >>> [ 21.292230][ C0] >>> [ 21.292905][ C0] CPU0 >>> [ 21.293235][ C0] ---- >>> [ 21.293575][ C0] lock(&xa->xa_lock#3); >>> [ 21.293987][ C0] <Interrupt> >>> [ 21.294327][ C0] lock(&xa->xa_lock#3); >>> [ 21.294753][ C0] >>> [ 21.294753][ C0] *** DEADLOCK *** >>> [ 21.294753][ C0] >>> [ 21.295483][ C0] 1 lock held by swapper/0/0: >>> [ 21.295914][ C0] #0: ffffffff8597a260 (rcu_callback){....}-{0:0}, >>> at: rcu_do_batch (kernel/rcu/tree.c:2104) >> >> It appears that RCU is trying to evict a tmpfs directory inode >> prematurely. >> lockdep catches this because someone else is trying to add an entry to >> it >> while RCU is trying to free it. Classic use-after-free. >> >> Jeff, the only new iput() in this patch is the one you suggested in >> shmem_symlink(). Are you sure it is needed (and also correct)? >> > > The code in your topic-shmem-stable-dir-cookies branch looks correct to > me. After shmem_get_inode, it holds an inode reference and that must be > explicitly put on error, unless you attach it to the dentry (via > d_instantiate). > > I'm not sure how to interpret this. The log is a bit of a mess. It looks > it ended up in some sort of recursive call into the same xarray due to > an interrupt? I think it's easier to see if you look at the dmesg.xz that was attached to the original report. The thing calling xa_destroy is being invoked from i_callback, which is the RCU-deferred "inode destroy" method. It's running in softIRQ context. > One thing that looks suspicious to me is that this patch has the call to > shmem_doff_map_destroy in free_inode (which is the RCU callback). I > think you probably want to do that in destroy_inode instead since that > involves taking locks and such. I'll have a look! > I'm not sure that's enough to explain how it ended up here though. > >> >>> [ 21.296705][ C0] >>> [ 21.296705][ C0] stack backtrace: >>> [ 21.297254][ C0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted >>> 6.3.0-rc5-00001-g5fd403eb6c18 #1 >>> [ 21.298007][ C0] Hardware name: QEMU Standard PC (i440FX + >>> PIIX, 1996), BIOS 1.16.0-debian-1.16.0-5 04/01/2014 >>> [ 21.298867][ C0] Call Trace: >>> [ 21.299199][ C0] <IRQ> >>> [ 21.299503][ C0] dump_stack_lvl (lib/dump_stack.c:107) >>> [ 21.299926][ C0] mark_lock_irq (kernel/locking/lockdep.c:232 >>> kernel/locking/lockdep.c:3976 kernel/locking/lockdep.c:4179) >>> [ 21.300362][ C0] ? secondary_startup_64_no_verify >>> (arch/x86/kernel/head_64.S:358) >>> [ 21.300888][ C0] ? __pfx_mark_lock_irq >>> (kernel/locking/lockdep.c:4170) >>> [ 21.301353][ C0] ? __pfx_stack_trace_save (kernel/stacktrace.c:114) >>> [ 21.302659][ C0] ? orc_find+0x23b/0x4d0 >>> [ 21.303106][ C0] ? save_trace (kernel/locking/lockdep.c:586) >>> [ 21.303519][ C0] ? secondary_startup_64_no_verify >>> (arch/x86/kernel/head_64.S:358) >>> [ 21.304046][ C0] ? kernel_text_address (kernel/extable.c:99) >>> [ 21.304507][ C0] mark_lock+0x4b8/0x13a0 >>> [ 21.304959][ C0] ? __pfx_mark_lock+0x10/0x10 >>> [ 21.305445][ C0] mark_usage (kernel/locking/lockdep.c:4530) >>> [ 21.305844][ C0] __lock_acquire (kernel/locking/lockdep.c:5010) >>> [ 21.306285][ C0] ? mark_lock+0xbc/0x13a0 >>> [ 21.306741][ C0] ? kasan_save_stack (mm/kasan/common.c:46) >>> [ 21.307182][ C0] ? __pfx___lock_acquire >>> (kernel/locking/lockdep.c:4914) >>> [ 21.307647][ C0] ? rcu_do_batch (include/linux/rcupdate.h:332 >>> kernel/rcu/tree.c:2114) >>> [ 21.308082][ C0] ? rcu_core (kernel/rcu/tree.c:2374) >>> [ 21.308489][ C0] ? __do_softirq (arch/x86/include/asm/atomic.h:29 >>> include/linux/jump_label.h:260 include/linux/jump_label.h:270 >>> include/trace/events/irq.h:142 kernel/softirq.c:572) >>> [ 21.308922][ C0] ? irq_exit_rcu (kernel/softirq.c:445 >>> kernel/softirq.c:650 kernel/softirq.c:640 kernel/softirq.c:662) >>> [ 21.309353][ C0] lock_acquire (kernel/locking/lockdep.c:467 >>> kernel/locking/lockdep.c:5671 kernel/locking/lockdep.c:5634) >>> [ 21.309771][ C0] ? xa_destroy (lib/xarray.c:2214) >>> [ 21.310185][ C0] ? __pfx_lock_acquire >>> (kernel/locking/lockdep.c:5637) >>> [ 21.310639][ C0] ? __pfx___lock_acquire >>> (kernel/locking/lockdep.c:4914) >>> [ 21.311104][ C0] ? ftrace_likely_update >>> (arch/x86/include/asm/smap.h:56 kernel/trace/trace_branch.c:229) >>> [ 21.311571][ C0] _raw_spin_lock_irqsave >>> (include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162) >>> [ 21.312040][ C0] ? xa_destroy (lib/xarray.c:2214) >>> [ 21.312457][ C0] xa_destroy (lib/xarray.c:2214) >>> [ 21.312859][ C0] ? lock_acquire (kernel/locking/lockdep.c:467 >>> kernel/locking/lockdep.c:5671 kernel/locking/lockdep.c:5634) >>> [ 21.313290][ C0] ? __pfx_xa_destroy (lib/xarray.c:2208) >>> [ 21.313746][ C0] shmem_free_in_core_inode (mm/shmem.c:2989 >>> mm/shmem.c:4061) >>> [ 21.314231][ C0] ? __pfx_shmem_free_in_core_inode (mm/shmem.c:4058) >>> [ 21.314763][ C0] i_callback (fs/inode.c:252) >>> [ 21.315154][ C0] rcu_do_batch (include/linux/rcupdate.h:332 >>> kernel/rcu/tree.c:2114) >>> [ 21.315580][ C0] ? __pfx_rcu_do_batch (kernel/rcu/tree.c:2051) >>> [ 21.316037][ C0] ? _raw_spin_unlock_irqrestore >>> (arch/x86/include/asm/irqflags.h:42 >>> arch/x86/include/asm/irqflags.h:77 >>> arch/x86/include/asm/irqflags.h:135 >>> include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194) >>> [ 21.316550][ C0] ? rcu_report_qs_rdp (kernel/rcu/tree.c:2002) >>> [ 21.317012][ C0] rcu_core (kernel/rcu/tree.c:2374) >>> [ 21.317414][ C0] __do_softirq (arch/x86/include/asm/atomic.h:29 >>> include/linux/jump_label.h:260 include/linux/jump_label.h:270 >>> include/trace/events/irq.h:142 kernel/softirq.c:572) >>> [ 21.317836][ C0] irq_exit_rcu (kernel/softirq.c:445 >>> kernel/softirq.c:650 kernel/softirq.c:640 kernel/softirq.c:662) >>> [ 21.318251][ C0] sysvec_apic_timer_interrupt >>> (arch/x86/kernel/apic/apic.c:1107 (discriminator 14)) >>> [ 21.318749][ C0] </IRQ> >>> [ 21.319052][ C0] <TASK> >>> [ 21.319355][ C0] asm_sysvec_apic_timer_interrupt >>> (arch/x86/include/asm/idtentry.h:645) >>> [ 21.319880][ C0] RIP: 0010:default_idle >>> (arch/x86/include/asm/irqflags.h:37 >>> arch/x86/include/asm/irqflags.h:72 arch/x86/kernel/process.c:704) >>> [ 21.320344][ C0] Code: ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 >>> 90 90 90 90 f3 0f 1e fa 8b 05 f6 b4 a8 03 85 c0 7e 07 0f 00 2d 8b 96 >>> 3d 00 fb f4 <fa> c3 cc cc cc cc 0f 1f 00 90 90 90 90 90 90 90 90 90 >>> 90 90 90 90 >>> All code >>> ======== >>> 0: ff (bad) >>> 1: ff (bad) >>> 2: ff 90 90 90 90 90 call *-0x6f6f6f70(%rax) >>> 8: 90 nop >>> 9: 90 nop >>> a: 90 nop >>> b: 90 nop >>> c: 90 nop >>> d: 90 nop >>> e: 90 nop >>> f: 90 nop >>> 10: 90 nop >>> 11: 90 nop >>> 12: 90 nop >>> 13: f3 0f 1e fa endbr64 >>> 17: 8b 05 f6 b4 a8 03 mov 0x3a8b4f6(%rip),%eax # >>> 0x3a8b513 >>> 1d: 85 c0 test %eax,%eax >>> 1f: 7e 07 jle 0x28 >>> 21: 0f 00 2d 8b 96 3d 00 verw 0x3d968b(%rip) # 0x3d96b3 >>> 28: fb sti >>> 29: f4 hlt >>> 2a:* fa cli <-- trapping instruction >>> 2b: c3 ret >>> 2c: cc int3 >>> 2d: cc int3 >>> 2e: cc int3 >>> 2f: cc int3 >>> 30: 0f 1f 00 nopl (%rax) >>> 33: 90 nop >>> 34: 90 nop >>> 35: 90 nop >>> 36: 90 nop >>> 37: 90 nop >>> 38: 90 nop >>> 39: 90 nop >>> 3a: 90 nop >>> 3b: 90 nop >>> 3c: 90 nop >>> 3d: 90 nop >>> 3e: 90 nop >>> 3f: 90 nop >>> >>> Code starting with the faulting instruction >>> =========================================== >>> 0: fa cli >>> 1: c3 ret >>> 2: cc int3 >>> 3: cc int3 >>> 4: cc int3 >>> 5: cc int3 >>> 6: 0f 1f 00 nopl (%rax) >>> 9: 90 nop >>> a: 90 nop >>> b: 90 nop >>> c: 90 nop >>> d: 90 nop >>> e: 90 nop >>> f: 90 nop >>> 10: 90 nop >>> 11: 90 nop >>> 12: 90 nop >>> 13: 90 nop >>> 14: 90 nop >>> 15: 90 nop >>> >>> >>> -- >>> 0-DAY CI Kernel Test Service >>> https://github.com/intel/lkp-tests >> >> -- >> Chuck Lever >> >> > > -- > Jeff Layton <jlayton@xxxxxxxxxx> -- Chuck Lever