On Wed, 2023-04-12 at 19:09 +0000, Chuck Lever III wrote:
> 
> > On Apr 12, 2023, at 3:05 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > 
> > On Wed, 2023-04-12 at 18:03 +0000, Chuck Lever III wrote:
> > > 
> > > 
> > > > On Apr 10, 2023, at 8:36 PM, kernel test robot <yujie.liu@xxxxxxxxx> wrote:
> > > > 
> > > > Hello,
> > > > 
> > > > kernel test robot noticed "WARNING:inconsistent_lock_state" on:
> > > > 
> > > > commit: 5fd403eb6c181c63a3aacd55d92b80256a0670cf ("shmem: stable directory cookies")
> > > > git://git.kernel.org/cgit/linux/kernel/git/cel/linux topic-shmem-stable-dir-cookies
> > > > 
> > > > in testcase: boot
> > > > 
> > > > compiler: gcc-11
> > > > test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> > > > 
> > > > (please refer to attached dmesg/kmsg for entire log/backtrace)
> > > > 
> > > > 
> > > > If you fix the issue, kindly add following tag
> > > > > Reported-by: kernel test robot <yujie.liu@xxxxxxxxx>
> > > > > Link: https://lore.kernel.org/oe-lkp/202304101606.79aea62f-yujie.liu@xxxxxxxxx
> > > > 
> > > > 
> > > > [   21.279213][    C0] WARNING: inconsistent lock state
> > > > [   21.279668][    C0] 6.3.0-rc5-00001-g5fd403eb6c18 #1 Not tainted
> > > > [   21.280199][    C0] --------------------------------
> > > > [   21.280657][    C0] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> > > > [   21.281238][    C0] swapper/0/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
> > > > [   21.281773][    C0] ffff8881102e9b50 (&xa->xa_lock#3){+.?.}-{2:2}, at: xa_destroy (lib/xarray.c:2214)
> > > > [   21.283140][    C0] {SOFTIRQ-ON-W} state was registered at:
> > > > [   21.283640][    C0] __lock_acquire (kernel/locking/lockdep.c:5010)
> > > > [   21.284089][    C0] lock_acquire (kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5671 kernel/locking/lockdep.c:5634)
> > > > [   21.284513][    C0] _raw_spin_lock (include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154)
> > > > [   21.284937][    C0] shmem_doff_add (include/linux/xarray.h:965 mm/shmem.c:2943)
> > > > [   21.285375][    C0] shmem_mknod (mm/shmem.c:3014)
> > > > [   21.285791][    C0] vfs_mknod (fs/namei.c:3916)
> > > > [   21.286195][    C0] devtmpfs_work_loop (drivers/base/devtmpfs.c:228 drivers/base/devtmpfs.c:393 drivers/base/devtmpfs.c:408)
> > > > [   21.286653][    C0] devtmpfsd (devtmpfs.c:?)
> > > > [   21.287046][    C0] kthread (kernel/kthread.c:376)
> > > > [   21.287441][    C0] ret_from_fork (arch/x86/entry/entry_64.S:314)
> > > > [   21.287864][    C0] irq event stamp: 167451
> > > > [   21.288264][    C0] hardirqs last enabled at (167450): kasan_quarantine_put (arch/x86/include/asm/irqflags.h:42 (discriminator 1) arch/x86/include/asm/irqflags.h:77 (discriminator 1) arch/x86/include/asm/irqflags.h:135 (discriminator 1) mm/kasan/quarantine.c:242 (discriminator 1))
> > > > [   21.289095][    C0] hardirqs last disabled at (167451): _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:108 kernel/locking/spinlock.c:162)
> > > > [   21.289969][    C0] softirqs last enabled at (167330): __do_softirq (kernel/softirq.c:415 kernel/softirq.c:600)
> > > > [   21.290755][    C0] softirqs last disabled at (167355): irq_exit_rcu (kernel/softirq.c:445 kernel/softirq.c:650 kernel/softirq.c:640 kernel/softirq.c:662)
> > > > [   21.291540][    C0] 
> > > > [   21.291540][    C0] 
> > > > other info that might help us debug this:
> > > > [   21.292230][    C0]  Possible unsafe locking scenario:
> > > > [   21.292230][    C0] 
> > > > [   21.292905][    C0]        CPU0
> > > > [   21.293235][    C0]        ----
> > > > [   21.293575][    C0]   lock(&xa->xa_lock#3);
> > > > [   21.293987][    C0]   <Interrupt>
> > > > [   21.294327][    C0]     lock(&xa->xa_lock#3);
> > > > [   21.294753][    C0] 
> > > > [   21.294753][    C0]  *** DEADLOCK ***
> > > > [   21.294753][    C0] 
> > > > [   21.295483][    C0] 1 lock held by swapper/0/0:
> > > > [   21.295914][    C0]  #0: ffffffff8597a260 (rcu_callback){....}-{0:0}, at: rcu_do_batch (kernel/rcu/tree.c:2104)
> > > 
> > > It appears that RCU is trying to evict a tmpfs directory inode prematurely.
> > > lockdep catches this because someone else is trying to add an entry to it
> > > while RCU is trying to free it. Classic use-after-free.
> > > 
> > > Jeff, the only new iput() in this patch is the one you suggested in
> > > shmem_symlink(). Are you sure it is needed (and also correct)?
> > 
> > The code in your topic-shmem-stable-dir-cookies branch looks correct to
> > me. After shmem_get_inode, it holds an inode reference and that must be
> > explicitly put on error, unless you attach it to the dentry (via
> > d_instantiate).
> > 
> > I'm not sure how to interpret this. The log is a bit of a mess. It looks
> > like it ended up in some sort of recursive call into the same xarray due
> > to an interrupt?
> 
> I think it's easier to see if you look at the dmesg.xz that was
> attached to the original report.
> 
> The thing calling xa_destroy is being invoked from i_callback,
> which is the RCU-deferred "inode destroy" method. It's running
> in softIRQ context.

Right, but why is it trying to add an entry to an xarray that is being
destroyed? Or maybe it isn't, and lockdep is just confused and is
classifying the various per-inode xarrays together? I have a hard time
interpreting these reports sometimes.
:-/

> > One thing that looks suspicious to me is that this patch has the call to
> > shmem_doff_map_destroy in free_inode (which is the RCU callback). I
> > think you probably want to do that in destroy_inode instead, since that
> > involves taking locks and such.
> 
> I'll have a look!

Cool, I think that's probably safest here. In principle, the xarray
should be empty by the time we get to this point, so there ought not to
be much to do anyway.

> > I'm not sure that's enough to explain how it ended up here though.
> > 
> > -- 
> > Jeff Layton <jlayton@xxxxxxxxxx>
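
[Editorial note: a minimal sketch of the reordering suggested above. This is not the actual patch; the function and cache names (shmem_doff_map_destroy, shmem_free_in_core_inode, shmem_inode_cachep) follow the branch under discussion but are assumptions about its exact shape. The idea is that ->destroy_inode runs synchronously in process context, while ->free_inode runs from the RCU callback in softirq context, where taking xa_lock via xa_destroy() is what lockdep flagged.]

```c
/* Sketch: tear down the per-directory offset map in ->destroy_inode
 * (process context, before the RCU grace period) instead of in the
 * RCU-deferred ->free_inode callback.
 */
static void shmem_destroy_inode(struct inode *inode)
{
	if (S_ISDIR(inode->i_mode))
		shmem_doff_map_destroy(inode);	/* xa_destroy() takes xa_lock */
}

static void shmem_free_in_core_inode(struct inode *inode)
{
	/* Runs from the RCU callback (softirq); take no locks here. */
	kmem_cache_free(shmem_inode_cachep, SHMEM_I(inode));
}

static const struct super_operations shmem_ops = {
	/* ... */
	.destroy_inode	= shmem_destroy_inode,
	.free_inode	= shmem_free_in_core_inode,
	/* ... */
};
```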