Re: [syzbot] possible deadlock in jbd2_journal_lock_updates

Jan Kara <jack@xxxxxxx> · Fri, 14 Oct 2022 15:25:43 +0200

Hello!

On Fri 14-10-22 08:42:57, Thilo Fromm wrote:
> Just want to make sure this does not get lost - as mentioned earlier,
> reverting 51ae846cff5 leads to a kernel build that does not have this issue.

Yes, I'm aware of this and still cannot quite wrap my head around how that
could be, given the stacktraces I see :) They do not seem to come anywhere
near that code...

> > Sure, I think this worked fine. It's the buffer lock but right before it we're
> > opening a journal transaction. Symbolized it looks like this:
> > 
> >    ext4_mark_iloc_dirty (include/linux/buffer_head.h:308 fs/ext4/inode.c:5712) ext4
> >    __schedule (kernel/sched/core.c:4994 kernel/sched/core.c:6341)
> >    _raw_spin_lock_irqsave (arch/x86/include/asm/paravirt.h:585 arch/x86/include/asm/qspinlock.h:51 include/asm-generic/qspinlock.h:85 include/linux/spinlock.h:199 include/linux/spinlock_api_smp.h:119 kernel/locking/spinlock.c:162)
> >    __ext4_journal_start_sb (fs/ext4/ext4_jbd2.c:105) ext4
> >    __wait_on_bit_lock (arch/x86/include/asm/bitops.h:214 include/asm-generic/bitops/instrumented-non-atomic.h:135 kernel/sched/wait_bit.c:89)
> >    out_of_line_wait_on_bit_lock (kernel/sched/wait_bit.c:118)
> >    var_wake_function (kernel/sched/wait_bit.c:22)
> >    ext4_xattr_block_set (include/linux/buffer_head.h:391 fs/ext4/xattr.c:2019) ext4
> >    ext4_xattr_set_handle (fs/ext4/xattr.c:2395) ext4
> >    ext4_initxattrs (fs/ext4/xattr_security.c:48) ext4
> >    security_inode_init_security (security/security.c:1114)
> >    ext4_init_acl (fs/ext4/xattr_security.c:38) ext4
> >    __ext4_new_inode (fs/ext4/ialloc.c:1325) ext4
> >    ext4_create (fs/ext4/namei.c:2796) ext4
> >    path_openat (fs/namei.c:3334 fs/namei.c:3404 fs/namei.c:3612)
> >    do_filp_open (fs/namei.c:3642)
> >    vfs_statx (include/linux/namei.h:57 fs/stat.c:221)
> >    __check_object_size (mm/usercopy.c:240 mm/usercopy.c:286 mm/usercopy.c:256)
> >    do_sys_openat2 (fs/open.c:1214)
> >    __x64_sys_openat (fs/open.c:1241)
> >    do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
> >    entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:118)
> 
> Is the symbolised stack trace Jeremi sent helpful to get to the bottom of
> this issue? Can we do anything else to help?

Yes, thanks for the symbolized stacktraces and sorry for the delay. They made
it clear we are hanging on the buffer lock. So far I still don't understand
the deadlock scenario (in particular, who can be holding the buffer locked),
and I'm too busy with something else at SUSE to seriously delve into this
right now, but I'll get back to you :).

								Honza

-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR