On 2023/6/16 21:27, Jan Kara wrote: > On Fri 16-06-23 09:55:47, Zhang Yi wrote: >> From: Zhang Yi <yi.zhang@xxxxxxxxxx> >> >> We got a NULL pointer dereference issue below while running generic/475 >> I/O failure pressure test. >> >> BUG: kernel NULL pointer dereference, address: 0000000000000000 >> #PF: supervisor write access in kernel mode >> #PF: error_code(0x0002) - not-present page >> PGD 0 P4D 0 >> Oops: 0002 [#1] PREEMPT SMP PTI >> CPU: 1 PID: 15600 Comm: fsstress Not tainted 6.4.0-rc5-xfstests-00055-gd3ab1bca26b4 #190 >> RIP: 0010:jbd2_journal_set_features+0x13d/0x430 >> ... >> Call Trace: >> <TASK> >> ? __die+0x23/0x60 >> ? page_fault_oops+0xa4/0x170 >> ? exc_page_fault+0x67/0x170 >> ? asm_exc_page_fault+0x26/0x30 >> ? jbd2_journal_set_features+0x13d/0x430 >> jbd2_journal_revoke+0x47/0x1e0 >> __ext4_forget+0xc3/0x1b0 >> ext4_free_blocks+0x214/0x2f0 >> ext4_free_branches+0xeb/0x270 >> ext4_ind_truncate+0x2bf/0x320 >> ext4_truncate+0x1e4/0x490 >> ext4_handle_inode_extension+0x1bd/0x2a0 >> ? iomap_dio_complete+0xaf/0x1d0 >> >> The root cause is the journal super block had been failed to write out >> due to I/O fault injection, it's uptodate bit was cleared by >> end_buffer_write_sync() and didn't reset yet in jbd2_write_superblock(). >> And it raced by journal_get_superblock()->bh_read(), unfortunately, the >> read IO is also failed, so the error handling in >> journal_fail_superblock() unexpectedly clear the journal->j_sb_buffer, >> finally lead to above NULL pointer dereference issue. >> >> If the journal super block had been read and verified, there is no need >> to call bh_read() read it again even if it has been failed to written >> out. So the fix could be simply move buffer_verified(bh) in front of >> bh_read(). Also remove a stale comment left in >> jbd2_journal_check_used_features(). >> >> Fixes: 51bacdba23d8 ("jbd2: factor out journal initialization from journal_get_superblock()") >> Reported-by: Theodore Ts'o <tytso@xxxxxxx> >> Signed-off-by: Zhang Yi <yi.zhang@xxxxxxxxxx> > > This works as a workaround. It is a bit kludgy but for now I guess it is > good enough. Thanks for the fix and feel free to add: > Thanks for the review. Yes, I suppose it's better to find a way to adjust the sequence of journal load and feature checking in ocfs2_check_volume(), so that we could completely remove the journal_get_superblock() in jbd2_journal_check_used_features(). Thanks, Yi. > > >> --- >> v1->v2: >> - Remove a stale comment left in jbd2_journal_check_used_features(). >> >> fs/jbd2/journal.c | 7 +++---- >> 1 file changed, 3 insertions(+), 4 deletions(-) >> >> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c >> index b5e57735ab3f..559242df0f9a 100644 >> --- a/fs/jbd2/journal.c >> +++ b/fs/jbd2/journal.c >> @@ -1919,6 +1919,9 @@ static int journal_get_superblock(journal_t *journal) >> bh = journal->j_sb_buffer; >> >> J_ASSERT(bh != NULL); >> + if (buffer_verified(bh)) >> + return 0; >> + >> err = bh_read(bh, 0); >> if (err < 0) { >> printk(KERN_ERR >> @@ -1926,9 +1929,6 @@ static int journal_get_superblock(journal_t *journal) >> goto out; >> } >> >> - if (buffer_verified(bh)) >> - return 0; >> - >> sb = journal->j_superblock; >> >> err = -EINVAL; >> @@ -2229,7 +2229,6 @@ int jbd2_journal_check_used_features(journal_t *journal, unsigned long compat, >> >> if (!compat && !ro && !incompat) >> return 1; >> - /* Load journal superblock if it is not loaded yet. */ >> if (journal_get_superblock(journal)) >> return 0; >> if (!jbd2_format_support_feature(journal)) >> -- >> 2.39.2 >>