Hi Ted,
On 1/14/25 21:38, Theodore Ts'o wrote:
On Tue, Jan 14, 2025 at 02:25:21PM +0800, Heming Zhao wrote:
The root cause appears to be that the jbd2 bypass recovery logic
is incorrect.
Heming, thanks for taking a look.
I'm not convinced the root cause is what you've stated. When
jbd2_journal_wipe() calls jbd2_mark_journal_empty(), s_start gets set
to zero:
Actually, ocfs2 calls jbd2_journal_wipe() with 'write=0' (hard coded),
so jbd2_mark_journal_empty() isn't called during the ocfs2 mount
phase. This means the following deduction won't apply in this case.
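To make that concrete, here is a rough user-space model of how the
'write' argument gates the superblock update. It is only a sketch, not
the kernel code: the model_* names and the one-field struct are made up
for illustration, and byte-order conversion is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the on-disk journal superblock. Only the
 * field discussed in this thread is modelled, in host byte order.
 */
struct model_sb {
	uint32_t s_start;	/* 0 means "journal is empty" */
};

/* Models jbd2_mark_journal_empty(): record that no recovery is needed. */
static void model_mark_empty(struct model_sb *sb)
{
	sb->s_start = 0;
}

/*
 * Models the relevant part of jbd2_journal_wipe(): the superblock is
 * only updated when the caller passes a non-zero 'write'.
 */
static void model_wipe(struct model_sb *sb, int write)
{
	if (write)
		model_mark_empty(sb);
}

/* Models the early-exit test in jbd2_journal_recover(). */
static bool model_recovery_bypassed(const struct model_sb *sb)
{
	return sb->s_start == 0;
}
```

With write=0, as in the ocfs2 mount path, model_wipe() leaves s_start
untouched, so the bypass test only succeeds if s_start was already zero
on disk.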
-- Heming
	sb->s_start = cpu_to_be32(0);
This then gets checked in jbd2_journal_recover():
	if (!sb->s_start) {
		jbd2_debug(1, "No recovery required, last transaction %d, head block %u\n",
			   be32_to_cpu(sb->s_sequence), be32_to_cpu(sb->s_head));
		journal->j_transaction_sequence = be32_to_cpu(sb->s_sequence) + 1;
		journal->j_head = be32_to_cpu(sb->s_head);
		return 0;
	}
I suspect that there is something else wrong with jbd2's superblock,
since this normally works in the absence of malicious fs image
fuzzing, such that when jbd2_journal_load() calls journal_reset()
after jbd2_journal_recover() correctly bypasses recovery, the WARN_ON
gets triggered.
I'd suggest that you enable jbd2 debugging so we can see all of the
jbd2_debug() messages to understand what might be going on.
By the way, given that this is only a WARN_ON, and it involves
malicious image fuzzing, this is probably a valid jbd2 bug, but it's
not actually a security bug. Sure, if someone is silly enough to pick
up a maliciously corrupted USB thumb drive dropped in a parking lot
and insert it into their desktop, and the distribution is silly enough
to allow automount, the worst that can happen is that the system
reboots, if it is configured to panic on a WARNING. So feel
free to prioritize your investigation appropriately. :-)
Cheers,
- Ted