On Mon 24-06-24 18:53:50, Jan Kara wrote:
> On Mon 24-06-24 11:26:58, Theodore Ts'o wrote:
> > On Sun, Jun 23, 2024 at 06:57:13PM -0700, Alexander Coffin wrote:
> > > [1.] One line summary of the problem:
> > > Using resize2fs on-line resizing on a specific ext4 partition is
> > > causing an Oops.
> > >
> > >
> > > [6.] Output of Oops.. message (if applicable) with symbolic information
> > > resolved (see Documentation/admin-guide/bug-hunting.rst)
> > >
> > > ```
> > > [ 445.552287] ------------[ cut here ]------------
> > > [ 445.552300] kernel BUG at fs/jbd2/journal.c:846!
> >
> > Thanks for the bug report. The BUG_ON is from the following assert in
> > jbd2_journal_next_log_block:
> >
> > 	J_ASSERT(journal->j_free > 1);
> >
> > and it indicates that we ran out of space in the journal. There are
> > mechanisms to make sure that this should never happen, and if the
> > journal is too small and the transaction couldn't be broken up, then
> > the operation (whether it is a resize or a file truncate or some other
> > operation) should have errored out, and not triggered a BUG.
>
> Yeah, I was debugging this today and I'll shortly send a fix for JBD2 so
> that we don't trigger this BUG. But the online resize will fail anyway
> after my fixes (just gracefully) because the add_flex_bg() code tries to
> start a transaction with more credits than the journal allows.

To be more precise, the problem is that with this size of the journal,
maximum transaction size is 250 metadata blocks (+6 blocks reserved for
descriptors). Online resizing tries to start a transaction with 252 credits
in ext4_flex_group_add(). 246 credits come from es->s_reserved_gdt_blocks so
I don't see an easy way how to avoid that because to each of these reserve
gdt blocks we need to add reserved gdt blocks from the new groups.

So I see two possibilities:

1) Just make mke2fs / tune2fs refuse so many reserved gdt blocks with a tiny
journal.

2) Allow larger transaction size - currently we require that 4 max sized
transactions fit into the journal, we could reduce it to 3 without
introducing deadlocks. But larger transactions could have other unexpected
performance side effects so I'm not sure the risk is worth it for a corner
case like this.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
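
For illustration only, the credit arithmetic in the message above can be
written out as a small Python sketch. The numbers 250, 6, 252 and 246 are
taken from the message. Everything else is an assumption made for the
example: the journal length of 1024 blocks is only inferred from "4 max sized
transactions fit into the journal", the helper names max_credits() and
resize_fits() are hypothetical, and the sketch assumes the descriptor
reservation stays at 6 blocks when the divisor changes, which the real JBD2
code may not do.

```python
# Figures quoted in the message.
MAX_TXN_METADATA_BLOCKS = 250   # usable credits per transaction
DESCRIPTOR_RESERVE = 6          # blocks reserved for descriptors
RESIZE_CREDITS = 252            # requested in ext4_flex_group_add()
RESERVED_GDT_CREDITS = 246      # part of the 252, from es->s_reserved_gdt_blocks

# ASSUMPTION: journal length inferred from "4 max sized transactions must fit".
JOURNAL_BLOCKS = 4 * (MAX_TXN_METADATA_BLOCKS + DESCRIPTOR_RESERVE)


def max_credits(journal_blocks, txns_per_journal,
                descriptor_reserve=DESCRIPTOR_RESERVE):
    """Hypothetical helper: usable credits if `txns_per_journal` max-sized
    transactions must fit in the journal.

    ASSUMPTION: the descriptor reservation is a fixed 6 blocks.
    """
    return journal_blocks // txns_per_journal - descriptor_reserve


def resize_fits(journal_blocks, txns_per_journal):
    """Hypothetical helper: would the 252-credit resize transaction start?"""
    return RESIZE_CREDITS <= max_credits(journal_blocks, txns_per_journal)


if __name__ == "__main__":
    for divisor in (4, 3):
        limit = max_credits(JOURNAL_BLOCKS, divisor)
        verdict = "fits" if resize_fits(JOURNAL_BLOCKS, divisor) else "too large"
        print(f"divisor {divisor}: limit {limit} credits, "
              f"resize needs {RESIZE_CREDITS} ({verdict})")
    # Credits in the resize request that do not come from the reserved
    # GDT blocks.
    print("non-GDT credits:", RESIZE_CREDITS - RESERVED_GDT_CREDITS)
```

Under these assumptions the resize request exceeds the current limit by two
credits, and it would fit if only 3 max sized transactions had to fit into
the journal. Since nearly all of the request comes from the reserved gdt
blocks, capping them relative to the journal size (possibility 1) attacks the
same inequality from the other side.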