On Mon, Aug 26, 2024 at 01:22:54AM -0300, Vinicius Peixoto wrote: > Since the disk data is bogus, journal_reset fails with -EINVAL ("JBD2: > Journal too short (blocks 2-1024)"); this leaves journal->j_head == NULL. > However, jbd2_journal_load clears the JBD2_ABORT flag right before calling > journal_reset. This leads to a problem later when ofcs2_mount_volume tries > to flush the journal as part of the cleanup when aborting the mount > operation: > > -> ofcs2_mount_volume (error; goto out_system_inodes) > -> ofcs2_journal_shutdown > -> jbd2_journal_flush > -> jbd2_cleanup_journal_tail (J_ASSERT fails) > ... > > I confirmed that setting the JBD2_ABORT flag in journal_reset before > returning -EINVAL fixes the problem: > > static int journal_reset(journal_t *journal) > journal_fail_superblock(journal); > + journal->j_flags |= JBD2_ABORT; > return -EINVAL; > > You can find a proper patch file + the syzbot re-test result in [1]. > However, I'm not entirely sure whether this is the correct decision, and I > wanted to confirm that this is an appropriate solution before sending a > proper patch to the mailing list. The reason why this isn't an issue with ext4 is because the normal "right" way to do this is if jbd2_journal_load() returns an error, is to call jbd2_journal_destroy() to release the data structures with the journal --- and then don't try to use the journal afterwards. The weird thing is that there are two code paths in ocfs2 that calls jbd2_journal_load(). One is in ocfs2_replay_journal() which does what ext4 does. The other is ocfs2_load_journal() which does *not* do this, and this is the one which you saw in the syzkaller reproducer. It looks like one codepath is used to replay the ocfs2 for some other node, and the is to load (and presumably later, replay) the journal for the mounting node. There are also some other things which look very confusing, such as the fact that ocfs2_journal_shutdown calls igrab: /* need to inc inode use count - jbd2_journal_destroy will iput. */ if (!igrab(inode)) BUG(); ... which is *weird*. Normaly the journal inode refcount is expected to be bumped before calling jbd2_journal_init_inode() --- which normally is done by an iget() function, and is needed to make sure the journal inode isn't released out from under the jbd2 layer. It looks like ocfs2 uses the journal inode for other stuff so get the journal inode without calling something like iget(). Which is OK, I guess, but it means that there are a whole bunch of places where it has to call !BUG_ON(igrab(journal->j_inode) before calling jbd2_journal_destroy(). It would seem like the *right* thing to do is to bump the refcount in ocfs2_journal_init(), and if for some reason the igrab fails, it can just return an error early, instead of having to resort to BUG_ON() later, and if you don't realize that you have to do this weird igrab() before calling jbd2_journal_destroy(), you'll end up leaking the journal inode. Anyway, I think the right thing to do is to restructure how ocfs2 calls the jbd2 journal, but that's going to require a more careful examination of the code paths. Your patch is a bit of a blunt force hack that should be harmless, but it's probably not the best way to fix the problem, although doing it "right" would be more complicated.... - Ted