On Tue, Jul 16 2024, Jan Kara wrote: > On Fri 12-07-24 10:53:02, Luis Henriques wrote: >> Since there's code (in fast-commit) that already handles a '0' tid as a >> special case, it's better to ensure that jbd2 never sets it to that value >> when journal->j_transaction_sequence increment wraps. >> >> Suggested-by: Andreas Dilger <adilger@xxxxxxxxx> >> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@xxxxxxxxx> > > Well, sadly it isn't so simple. If nothing else, journal replay > (do_one_pass()) will get broken by the skipped tid as we do check: > > if (sequence != next_commit_ID) { > brelse(bh); > break; > } > > So we'd abort journal replay too early. Secondly, there's also code > handling journal replay in libext2fs which would need to be checked and > fixed up. Finally, I've found code in mballoc which alternates between two > lists based on tid & 1, so this logic would get broken by skipping 0 tid > as well. > > Overall, I think we might be better off to go and fix places that assume > tid 0 is not valid. I can see those assumptions in: > > ext4_fc_mark_ineligible() > ext4_wait_for_tail_page_commit() > __jbd2_log_wait_for_space() > jbd2_journal_shrink_checkpoint_list() > > Now I don't see it as urgent to fix all these right now. Just for this > series let's not add another place making tid 0 special. Later we can fixup > the other places... Yikes! Looks like I haven't done my homework -- I should have caught at least one of the three breakages you point out. Obviously, because I've seen this assumption in different places, I thought it would be OK. Anyway, thanks a lot for point this out, Jan. I'll add a new TODO to my list to start looking at other places that need to be fixed. Cheers, -- Luís