Re: [syzbot] possible deadlock in jbd2_journal_lock_updates

Jan Kara <jack@xxxxxxx> · Thu, 10 Nov 2022 16:26:37 +0100

On Thu 10-11-22 04:57:58, Jeremi Piotrowski wrote:
> On Wed, Oct 26, 2022 at 12:18:54PM +0200, Jan Kara wrote:
> > On Mon 24-10-22 18:32:51, Thilo Fromm wrote:
> > > Hello Honza,
> > > 
> > > > Yeah, I was pondering about this for some time but still I have no clue who
> > > > could be holding the buffer lock (which blocks the task holding the
> > > > transaction open) or how this could related to the commit you have
> > > > identified. I have two things to try:
> > > > 
> > > > 1) Can you please check whether the deadlock reproduces also with 6.0
> > > > kernel? The thing is that xattr handling code in ext4 has there some
> > > > additional changes, commit 307af6c8793 ("mbcache: automatically delete
> > > > entries from cache on freeing") in particular.
> > > 
> > > This would be complex; we currently do not integrate 6.0 with Flatcar and
> > > would need to spend quite some effort ingesting it first (mostly, make sure
> > > the new kernel does not break something unrelated). Flatcar is an
> > > image-based distro, so kernel updates imply full distro updates.
> > 
> > OK, understood.
> > 
> > > > 2) I have created a debug patch (against 5.15.x stable kernel). Can you
> > > > please reproduce the failure with it and post the output of "echo w
> > > > > /proc/sysrq-trigger" and also the output the debug patch will put into the
> > > > kernel log? It will dump the information about buffer lock owner if we > cannot get the lock for more than 32 seconds.
> > > 
> > > This would be more straightforward - I can reach out to one of our users
> > > suffering from the issue; they can reliably reproduce it and don't shy away
> > > from patching their kernel. Where can I find the patch?
> > 
> > Ha, my bad. I forgot to attach it. Here it is.
> > 
> 
> Unfortunately this patch produced no output, but I have been able to repro so I
> understand why: except for the hung tasks, we have 1+ tasks busy-looping through
> the following code in ext4_xattr_block_set():
> 
> inserted:
>         if (!IS_LAST_ENTRY(s->first)) {
>                 new_bh = ext4_xattr_block_cache_find(inode, header(s->base),
>                                                      &ce);
>                 if (new_bh) {
>                         /* We found an identical block in the cache. */
>                         if (new_bh == bs->bh)
>                                 ea_bdebug(new_bh, "keeping");
>                         else {
>                                 u32 ref;
> 
>                                 WARN_ON_ONCE(dquot_initialize_needed(inode));
> 
>                                 /* The old block is released after updating
>                                    the inode. */
>                                 error = dquot_alloc_block(inode,
>                                                 EXT4_C2B(EXT4_SB(sb), 1));
>                                 if (error)
>                                         goto cleanup;
>                                 BUFFER_TRACE(new_bh, "get_write_access");
>                                 error = ext4_journal_get_write_access(
>                                                 handle, sb, new_bh,
>                                                 EXT4_JTR_NONE);
>                                 if (error)
>                                         goto cleanup_dquot;
>                                 lock_buffer(new_bh);
>                                 /*
>                                  * We have to be careful about races with
>                                  * adding references to xattr block. Once we
>                                  * hold buffer lock xattr block's state is
>                                  * stable so we can check the additional
>                                  * reference fits.
>                                  */
>                                 ref = le32_to_cpu(BHDR(new_bh)->h_refcount) + 1;
>                                 if (ref > EXT4_XATTR_REFCOUNT_MAX) {
>                                         /*
>                                          * Undo everything and check mbcache
>                                          * again.
>                                          */
>                                         unlock_buffer(new_bh);
>                                         dquot_free_block(inode,
>                                                          EXT4_C2B(EXT4_SB(sb),
>                                                                   1));
>                                         brelse(new_bh);
>                                         mb_cache_entry_put(ea_block_cache, ce);
>                                         ce = NULL;
>                                         new_bh = NULL;
>                                         goto inserted;
>                                 }
> 
> The tasks keep taking the 'goto inserted' branch, and never finish. I've been
> able to repro with kernel v6.0.7 as well.

Interesting! That makes is much clearer (and also makes my debug patch
unnecessary). So clearly the e_reusable variable in the mb_cache_entry got
out of sync with the number of references really in the xattr block - in
particular the block likely has h_refcount >= EXT4_XATTR_REFCOUNT_MAX but
e_reusable is set to true. Now I can see how e_reusable can stay at false due
to a race when refcount is actually smaller but I don't see how it could
stay at true when refcount is big enough - that part seems to be locked
properly. If you can reproduce reasonably easily, can you try reproducing
with attached patch? Thanks!

								Honza

-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
>From eb426ff2e678925781cb6804898a321e7e83b433 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@xxxxxxx>
Date: Thu, 10 Nov 2022 16:22:06 +0100
Subject: [PATCH] ext4: Debug xattr refcount

---
 fs/ext4/xattr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 36d6ba7190b6..ae7d9743f800 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -2026,6 +2026,7 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
 				 */
 				ref = le32_to_cpu(BHDR(new_bh)->h_refcount) + 1;
 				if (ref > EXT4_XATTR_REFCOUNT_MAX) {
+					printk("ce %p ref %u reusable %d\n", ce, ref, (int)ce->e_reusable);
 					/*
 					 * Undo everything and check mbcache
 					 * again.
-- 
2.35.3