On Sat 11-06-22 21:04:26, Zhang Yi wrote: > We catch an assert problem in jbd2_journal_commit_transaction() when > doing fsstress and request falut injection tests. The problem is > happened in a race condition between jbd2_journal_commit_transaction() > and ext4_end_io_end(). Firstly, ext4_writepages() writeback dirty pages > and start reserved handle, and then the journal was aborted due to some > previous metadata IO error, jbd2_journal_abort() start to commit current > running transaction, the committing procedure could be raced by > ext4_end_io_end() and lead to subtract j_reserved_credits twice from > commit_transaction->t_outstanding_credits, finally the > t_outstanding_credits is mistakenly smaller than t_nr_buffers and > trigger assert. > > kjournald2 kworker > > jbd2_journal_commit_transaction() > write_unlock(&journal->j_state_lock); > atomic_sub(j_reserved_credits, t_outstanding_credits); //sub once > > jbd2_journal_start_reserved() > start_this_handle() //detect aborted journal > jbd2_journal_free_reserved() //get running transaction > read_lock(&journal->j_state_lock) > __jbd2_journal_unreserve_handle() > atomic_sub(j_reserved_credits, t_outstanding_credits); > //sub again > read_unlock(&journal->j_state_lock); > > journal->j_running_transaction = NULL; > J_ASSERT(t_nr_buffers <= t_outstanding_credits) //bomb!!! > > Fix this issue by using journal->j_state_lock to protect the subtraction > in jbd2_journal_commit_transaction(). > > Fixes: 96f1e0974575 ("jbd2: avoid long hold times of j_state_lock while committing a transaction") > Signed-off-by: Zhang Yi <yi.zhang@xxxxxxxxxx> Thanks for the analysis and the fix! This is indeed subtle. This fix looks good to me. Feel free to add: Reviewed-by: Jan Kara <jack@xxxxxxx> Honza > --- > fs/jbd2/commit.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c > index eb315e81f1a6..af1a9191368c 100644 > --- a/fs/jbd2/commit.c > +++ b/fs/jbd2/commit.c > @@ -553,13 +553,13 @@ void jbd2_journal_commit_transaction(journal_t *journal) > */ > jbd2_journal_switch_revoke_table(journal); > > + write_lock(&journal->j_state_lock); > /* > * Reserved credits cannot be claimed anymore, free them > */ > atomic_sub(atomic_read(&journal->j_reserved_credits), > &commit_transaction->t_outstanding_credits); > > - write_lock(&journal->j_state_lock); > trace_jbd2_commit_flushing(journal, commit_transaction); > stats.run.rs_flushing = jiffies; > stats.run.rs_locked = jbd2_time_diff(stats.run.rs_locked, > -- > 2.31.1 > -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR