On Thu 08-11-18 13:29:38, Jan Kara wrote: > We can hold j_state_lock for writing at the beginning of > jbd2_journal_commit_transaction() for a rather long time (reportedly for > 30 ms) due cleaning revoke bits of all revoked buffers under it. The > handling of revoke tables as well as cleaning of t_reserved_list, and > checkpoint lists does not need j_state_lock for anything. It is only > needed to prevent new handles from joining the transaction. Generally > T_LOCKED transaction state prevents new handles from joining the > transaction - except for reserved handles which have to allowed to join > while we wait for other handles to complete. > > To prevent reserved handles from joining the transaction while cleaning > up lists, add new transaction state T_SWITCH and watch for it when > starting reserved handles. With this we can just drop the lock for > operations that don't need it. > > Reported-and-tested-by: Adrian Hunter <adrian.hunter@xxxxxxxxx> > Suggested-by: "Theodore Y. Ts'o" <tytso@xxxxxxx> > Signed-off-by: Jan Kara <jack@xxxxxxx> Ping? Honza > --- > fs/jbd2/commit.c | 3 +++ > fs/jbd2/transaction.c | 43 ++++++++++++++++++++++++++++++++++++++----- > include/linux/jbd2.h | 1 + > 3 files changed, 42 insertions(+), 5 deletions(-) > > Changes since v1: > - fixed handling of reserved handles > - rebased on top of 4.20-rc1 > - tested dioread-nolock xfstest run > > diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c > index 150cc030b4d7..2eb55c3361a8 100644 > --- a/fs/jbd2/commit.c > +++ b/fs/jbd2/commit.c > @@ -439,6 +439,8 @@ void jbd2_journal_commit_transaction(journal_t *journal) > finish_wait(&journal->j_wait_updates, &wait); > } > spin_unlock(&commit_transaction->t_handle_lock); > + commit_transaction->t_state = T_SWITCH; > + write_unlock(&journal->j_state_lock); > > J_ASSERT (atomic_read(&commit_transaction->t_outstanding_credits) <= > journal->j_max_transaction_buffers); > @@ -505,6 +507,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) > atomic_sub(atomic_read(&journal->j_reserved_credits), > &commit_transaction->t_outstanding_credits); > > + write_lock(&journal->j_state_lock); > trace_jbd2_commit_flushing(journal, commit_transaction); > stats.run.rs_flushing = jiffies; > stats.run.rs_locked = jbd2_time_diff(stats.run.rs_locked, > diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c > index c0b66a7a795b..116d8251fbff 100644 > --- a/fs/jbd2/transaction.c > +++ b/fs/jbd2/transaction.c > @@ -138,9 +138,9 @@ static inline void update_t_max_wait(transaction_t *transaction, > } > > /* > - * Wait until running transaction passes T_LOCKED state. Also starts the commit > - * if needed. The function expects running transaction to exist and releases > - * j_state_lock. > + * Wait until running transaction passes to T_FLUSH state and new transaction > + * can thus be started. Also starts the commit if needed. The function expects > + * running transaction to exist and releases j_state_lock. > */ > static void wait_transaction_locked(journal_t *journal) > __releases(journal->j_state_lock) > @@ -160,6 +160,32 @@ static void wait_transaction_locked(journal_t *journal) > finish_wait(&journal->j_wait_transaction_locked, &wait); > } > > +/* > + * Wait until running transaction transitions from T_SWITCH to T_FLUSH > + * state and new transaction can thus be started. The function releases > + * j_state_lock. > + */ > +static void wait_transaction_switching(journal_t *journal) > + __releases(journal->j_state_lock) > +{ > + DEFINE_WAIT(wait); > + > + if (WARN_ON(!journal->j_running_transaction || > + journal->j_running_transaction->t_state != T_SWITCH)) > + return; > + prepare_to_wait(&journal->j_wait_transaction_locked, &wait, > + TASK_UNINTERRUPTIBLE); > + read_unlock(&journal->j_state_lock); > + /* > + * We don't call jbd2_might_wait_for_commit() here as there's no > + * waiting for outstanding handles happening anymore in T_SWITCH state > + * and handling of reserved handles actually relies on that for > + * correctness. > + */ > + schedule(); > + finish_wait(&journal->j_wait_transaction_locked, &wait); > +} > + > static void sub_reserved_credits(journal_t *journal, int blocks) > { > atomic_sub(blocks, &journal->j_reserved_credits); > @@ -183,7 +209,8 @@ static int add_transaction_credits(journal_t *journal, int blocks, > * If the current transaction is locked down for commit, wait > * for the lock to be released. > */ > - if (t->t_state == T_LOCKED) { > + if (t->t_state != T_RUNNING) { > + WARN_ON_ONCE(t->t_state >= T_FLUSH); > wait_transaction_locked(journal); > return 1; > } > @@ -360,8 +387,14 @@ static int start_this_handle(journal_t *journal, handle_t *handle, > /* > * We have handle reserved so we are allowed to join T_LOCKED > * transaction and we don't have to check for transaction size > - * and journal space. > + * and journal space. But we still have to wait while running > + * transaction is being switched to a committing one as it > + * won't wait for any handles anymore. > */ > + if (transaction->t_state == T_SWITCH) { > + wait_transaction_switching(journal); > + goto repeat; > + } > sub_reserved_credits(journal, blocks); > handle->h_reserved = 0; > } > diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h > index b708e5169d1d..118d00a64184 100644 > --- a/include/linux/jbd2.h > +++ b/include/linux/jbd2.h > @@ -575,6 +575,7 @@ struct transaction_s > enum { > T_RUNNING, > T_LOCKED, > + T_SWITCH, > T_FLUSH, > T_COMMIT, > T_COMMIT_DFLUSH, > -- > 2.16.4 > -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR