ext4: improve commit path performance for fast commit

This patch series supersedes the patch "ext4: remove journal barrier
during fast commit" sent in Feb 2022. It reworks the fast commit's
commit path to improve its overall performance. The following
optimizations have been added in this series:

* Avoid having to lock the journal throughout the fast commit.
* Remove tracking of open handles per inode.

With the changes introduced in this patch series, the commit path for
fast commits is now as follows:

[1] Lock the journal by calling jbd2_journal_lock_updates(). This
    ensures that all the existing handles finish and no new handles
    can start.
[2] Mark all the fast commit eligible inodes as undergoing fast commit
    by setting the "EXT4_STATE_FC_COMMITTING" state.
[3] Unlock the journal by calling jbd2_journal_unlock_updates(). This
    allows new handles to start. If a new handle tries to start an
    update on any of the inodes that are being committed,
    ext4_fc_track_inode() will block until those inodes have finished
    the fast commit.
[4] Submit data buffers of all the committing inodes.
[5] Wait for [4] to complete.
[6] Commit all the directory entry updates in the fast commit space.
[7] Commit all the changed inodes in the fast commit space and clear
    "EXT4_STATE_FC_COMMITTING" for all the inodes.
[8] Write the tail tag to ensure atomicity of commits.

(The above flow has been documented in the code as well.)

Instead of calling ext4_fc_track_inode() in ext4_journal_start() as I
originally proposed on the code review of [PATCH V2 2/5] "ext4: for
committing inode, make ext4_fc_track_inode wait" [1], in this version I
changed the behavior of ext4_reserve_inode_write() to also call
ext4_fc_track_inode(). Let's call this approach 1. I also evaluated
another approach (approach 2) where ext4_reserve_inode_write() acts
just as an assertion to ensure that the inode is on the fast commit
list and the actual call to ext4_fc_track_inode() is done by
ext4_journal_start().
However, this results in too many stray ext4_fc_track_inode() calls.
Approach 1 reduces the number of stray ext4_fc_track_inode() calls and
thus makes the code more maintainable.

However, approach 1 results in a potential deadlock where the caller
can hang if it grabs i_data_sem before calling ext4_fc_track_inode().
To handle that, I have added explicit calls to ext4_fc_track_inode() in
such places. Eventually, when we migrate to using the extent status
tree for logical to physical mapping lookup, we can get rid of this
ordering requirement and also remove these calls to
ext4_fc_track_inode(). But even after adding these stray calls, the
number of stray calls to ext4_fc_track_inode() was lower in approach 1
than in approach 2.

I verified that the patch series introduces no regressions in the
"quick" and "log" groups when the "fast_commit" feature is enabled.

[1] https://www.spinics.net/lists/linux-ext4/msg82019.html

Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@xxxxxxxxx>

Harshad Shirwadkar (6):
  ext4: convert i_fc_lock to spinlock
  ext4: for committing inode, make ext4_fc_track_inode wait
  ext4: mark inode dirty before grabbing i_data_sem in ext4_setattr
  ext4: rework fast commit commit path
  ext4: drop i_fc_updates from inode fc info
  ext4: update code documentation

 fs/ext4/ext4.h        |  12 +--
 fs/ext4/fast_commit.c | 240 +++++++++++++++++++++---------------------
 fs/ext4/inline.c      |   3 +
 fs/ext4/inode.c       |  10 +-
 fs/ext4/super.c       |   2 +-
 fs/jbd2/journal.c     |   2 -
 6 files changed, 136 insertions(+), 133 deletions(-)

-- 
2.36.0.rc0.470.gd361397f0d-goog
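
For readers who want to see steps [1] to [8] of the commit path in code
form, here is a minimal userspace sketch. It is not the code from this
series: every model_* name is hypothetical, the commit is split into two
functions only so that the intermediate state can be inspected, and the
blocking in ext4_fc_track_inode() is reduced to a "would block" query
because the model is single-threaded and does no I/O.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these names are not kernel symbols. */
struct model_inode {
	bool fc_eligible;    /* inode is on the fast commit list */
	bool fc_committing;  /* stands in for EXT4_STATE_FC_COMMITTING */
};

struct model_journal {
	bool updates_locked; /* true only between steps [1] and [3] */
};

/* New handles can start whenever the journal is not locked. */
bool model_handle_may_start(const struct model_journal *j)
{
	return !j->updates_locked;
}

/*
 * In the series, ext4_fc_track_inode() blocks while the inode is being
 * committed. This model only reports whether the caller would wait.
 */
bool model_track_inode_would_block(const struct model_inode *inode)
{
	return inode->fc_committing;
}

/* Steps [1] to [3]: the only part that runs with the journal locked. */
void model_fc_begin(struct model_journal *j, struct model_inode *inodes,
		    size_t n)
{
	size_t i;

	assert(!j->updates_locked);
	j->updates_locked = true;               /* [1] lock updates */
	for (i = 0; i < n; i++)                 /* [2] mark eligible inodes */
		if (inodes[i].fc_eligible)
			inodes[i].fc_committing = true;
	j->updates_locked = false;              /* [3] unlock updates */
}

/* Steps [4] to [8]: run with the journal unlocked. */
void model_fc_finish(struct model_inode *inodes, size_t n)
{
	size_t i;

	/* [4]-[6]: data submission, wait and dentry updates omitted (no I/O). */
	for (i = 0; i < n; i++)                 /* [7] commit inodes, clear flag */
		inodes[i].fc_committing = false;
	/* [8]: the tail tag write would go here. */
}
```

After model_fc_begin() returns, model_handle_may_start() is true again
while model_track_inode_would_block() stays true for the marked inodes
until model_fc_finish() runs. That is the property the rework relies on:
unrelated updates proceed during the commit, and only updates to
committing inodes wait.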