Presently we always BUG_ON if trying to start a transaction on a journal
marked with JBD2_UNMOUNT, since this should never happen. However, while
running stress tests it was observed that in some error handling paths
it is possible for update_super_work to start a transaction after the
journal is destroyed, e.g.:

(umount)
ext4_kill_sb
  kill_block_super
    generic_shutdown_super
      sync_filesystem /* commits all txns */
      evict_inodes
        /* might start a new txn */
      ext4_put_super
        flush_work(&sbi->s_sb_upd_work) /* flush the workqueue */
        jbd2_journal_destroy
          journal_kill_thread
            journal->j_flags |= JBD2_UNMOUNT;
          jbd2_journal_commit_transaction
            jbd2_journal_get_descriptor_buffer
              jbd2_journal_bmap
                ext4_journal_bmap
                  ext4_map_blocks
                    ...
                    ext4_inode_error
                      ext4_handle_error
                        schedule_work(&sbi->s_sb_upd_work)

                                        /* work queue kicks in */
                                        update_super_work
                                          jbd2_journal_start
                                            start_this_handle
                                              BUG_ON(journal->j_flags &
                                                     JBD2_UNMOUNT)

Hence, make sure we only defer the update of the ext4 sb if the sb is
still active. Otherwise, just fall back to an un-journaled commit.

The important thing to note here is that we must only defer the sb
update if we have not yet flushed the s_sb_upd_work queue in the umount
path, else this race can be hit (point 1 below). Since we don't have a
direct way to check for that, we use SB_ACTIVE instead. The SB_ACTIVE
check is a bit subtle, so adding some notes below for future reference:

1. Ideally we would want to have something like
((flags & JBD2_UNMOUNT) == 0), however this is not correct since we
could end up scheduling work after it has been flushed:

 ext4_put_super
  flush_work(&sbi->s_sb_upd_work)

                           **kjournald2**
                           jbd2_journal_commit_transaction
                           ...
                           ext4_inode_error
                             /* JBD2_UNMOUNT not set */
                             schedule_work(s_sb_upd_work)

  jbd2_journal_destroy
   journal->j_flags |= JBD2_UNMOUNT;

                                      **workqueue**
                                      update_super_work
                                       jbd2_journal_start
                                        start_this_handle
                                          BUG_ON(JBD2_UNMOUNT)

Something like the above doesn't happen with the SB_ACTIVE check because
we are sure that the workqueue would be flushed at a later point if we
are in the umount path.

2. We don't need a similar check in ext4_grp_locked_error since it is
only called from mballoc and AFAICT it would always be valid to schedule
work here.

Fixes: 2d01ddc86606 ("ext4: save error info to sb through journal if available")
Reported-by: Mahesh Kumar <maheshkumar657g@xxxxxxxxx>
Suggested-by: Ritesh Harjani <ritesh.list@xxxxxxxxx>
Signed-off-by: Ojaswin Mujoo <ojaswin@xxxxxxxxxxxxx>
---
 fs/ext4/super.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index a963ffda692a..b7341e9acf62 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -706,7 +706,7 @@ static void ext4_handle_error(struct super_block *sb, bool force_ro, int error,
 		 * constraints, it may not be safe to do it right here so we
 		 * defer superblock flushing to a workqueue.
 		 */
-		if (continue_fs && journal)
+		if (continue_fs && journal && (sb->s_flags & SB_ACTIVE))
 			schedule_work(&EXT4_SB(sb)->s_sb_upd_work);
 		else
 			ext4_commit_super(sb);
-- 
2.48.1