On Fri 12-07-24 18:38:21, harshad shirwadkar wrote: > On Fri, Jun 28, 2024 at 6:43 AM Jan Kara <jack@xxxxxxx> wrote: > > > +static void __jbd2_journal_lock_updates(journal_t *journal, bool wait_on_rsv) > > > { > > > jbd2_might_wait_for_commit(journal); > > > > > > write_lock(&journal->j_state_lock); > > > ++journal->j_barrier_count; > > > > > > - /* Wait until there are no reserved handles */ > > > - if (atomic_read(&journal->j_reserved_credits)) { > > > + if (wait_on_rsv && atomic_read(&journal->j_reserved_credits)) { > > > + /* Wait until there are no reserved handles */ > > > > So it is not as simple as this. start_this_handle() ignores > > journal->j_barrier_count for reserved handles so they would happily start > > while you have the journal locked with jbd2_journal_lock_updates_no_rsv() > > and then writeback code could mess with your fastcommit state. Or perhaps I > > miss some subtlety why this is fine - but that then deserves a good > > explanation in a comment or maybe a different API because currently > > jbd2_journal_lock_updates_no_rsv() doesn't do what one would naively > > expect. > > AFAICT, jbd2_journal_commit_transaction() only calls > jbd2_journal_wait_updates(journal) which waits for > trasaction->t_updates to reach 0. But it doesn't wait for > journal->j_reserved_credits to reach 0. I saw a performance > improvement by removing waiting on reserved handles in FC commit code > as well. Given that JBD2 doesn't wait, I (perhaps incorrectly) thought > that it'd be okay to not wait in FC as well. Could you help me > understand why the JBD2 journal commit doesn't need to wait for > reserved handles? Sure. When we do page writeback, we may need to do some metadata modifications (such as clearing unwritten bits in the extent tree) before the writeback can complete and we can clear PG_Writeback bits. Hence if we started a normal transaction after IO completes to do metadata modifications, we could easily deadlock with transaction commit - broadly speaking transaction commit waits for PG_Writeback to clear, page writeback code waits for transaction commit so that it can free space in the journal and start a new transaction. This is why reserved transactions were introduced. Their main point is: If you have handle reserved, you are guaranteed you can start this handle without blocking waiting for space in the journal. Note that we could also start a normal handle before doing page writeback and use it after IO completion for metadata changes and this would work in principle (besides the technical troubles with propagating the handle to IO completion context) but because we can writeback quite large chunks of data, these handles could be running for tens of seconds which would make other filesystem operations starve. Thus we allow reserved (but not yet started) handles to be moved from currently running transaction into a next one and this is fine because the reservation code makes sure there's always enough free space in the journal for reserved handles. So commit code can happily do transaction commit while reserved handles are existing because if their owner decides to start them, they'll just join the currently running transaction (or create one if there's none). But jbd2_journal_lock_updates() always needs to wait for all outstanding handles including the reserved ones because it needs to guarantee there are no modifications pending for the journal. Even you in fastcommit code rely on inode list not being modified after jbd2_journal_lock_updates() and reserved handles would break that assumption because existing reserved handles can start after your version of jbd2_journal_lock_updates() not waiting for reserved handles would return. And reserved handles need to be able to start while jbd2_journal_lock_updates() is waiting for the journal to quiesce because we need page writeback to finish before we can quiesce the journal. In theory you could create new journal locking mechanism for fastcommit code that would *also* block starting of reserved handles since what fastcommit needs to do with a locked journal does not depend on page writeback. But frankly I'm not convinced this complication is worth it. I hope this makes things clearer. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR