The patch titled Subject: ocfs2: improve recovery performance has been added to the -mm tree. Its filename is ocfs2-improve-recovery-performance.patch This patch should soon appear at http://ozlabs.org/~akpm/mmots/broken-out/ocfs2-improve-recovery-performance.patch and later at http://ozlabs.org/~akpm/mmotm/broken-out/ocfs2-improve-recovery-performance.patch Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/SubmitChecklist when testing your code *** The -mm tree is included into linux-next and is updated there every 3-4 working days ------------------------------------------------------ From: Junxiao Bi <junxiao.bi@xxxxxxxxxx> Subject: ocfs2: improve recovery performance Journal replay will be run when performing recovery for a dead node. To avoid the stale cache impact, all blocks of dead node's journal inode were reloaded from disk. This hurts the performance. Check whether one block is cached before reloading it can improve performance a lot. In my test env, the time doing recovery was improved from 120s to 1s. Link: http://lkml.kernel.org/r/1466155682-24656-1-git-send-email-junxiao.bi@xxxxxxxxxx Signed-off-by: Junxiao Bi <junxiao.bi@xxxxxxxxxx> Reviewed-by: Joseph Qi <joseph.qi@xxxxxxxxxx> Cc: "Gang He" <ghe@xxxxxxxx> Cc: Mark Fasheh <mfasheh@xxxxxxx> Cc: Joel Becker <jlbec@xxxxxxxxxxxx> Cc: Joseph Qi <joseph.qi@xxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- fs/ocfs2/journal.c | 43 +++++++++++++++++++++++-------------------- 1 file changed, 23 insertions(+), 20 deletions(-) diff -puN fs/ocfs2/journal.c~ocfs2-improve-recovery-performance fs/ocfs2/journal.c --- a/fs/ocfs2/journal.c~ocfs2-improve-recovery-performance +++ a/fs/ocfs2/journal.c @@ -1159,10 +1159,8 @@ static int ocfs2_force_read_journal(stru int status = 0; int i; u64 v_blkno, p_blkno, p_blocks, num_blocks; -#define CONCURRENT_JOURNAL_FILL 32ULL - struct buffer_head *bhs[CONCURRENT_JOURNAL_FILL]; - - memset(bhs, 0, sizeof(struct buffer_head *) * CONCURRENT_JOURNAL_FILL); + struct buffer_head *bh = NULL; + struct ocfs2_super *osb = OCFS2_SB(inode->i_sb); num_blocks = ocfs2_blocks_for_bytes(inode->i_sb, i_size_read(inode)); v_blkno = 0; @@ -1174,29 +1172,34 @@ static int ocfs2_force_read_journal(stru goto bail; } - if (p_blocks > CONCURRENT_JOURNAL_FILL) - p_blocks = CONCURRENT_JOURNAL_FILL; - - /* We are reading journal data which should not - * be put in the uptodate cache */ - status = ocfs2_read_blocks_sync(OCFS2_SB(inode->i_sb), - p_blkno, p_blocks, bhs); - if (status < 0) { - mlog_errno(status); - goto bail; - } + for (i = 0; i < p_blocks; i++) { + bh = __find_get_block(osb->sb->s_bdev, p_blkno, + osb->sb->s_blocksize); + /* block not cached. */ + if (!bh) { + p_blkno++; + continue; + } + + brelse(bh); + bh = NULL; + /* We are reading journal data which should not + * be put in the uptodate cache. + */ + status = ocfs2_read_blocks_sync(osb, p_blkno, 1, &bh); + if (status < 0) { + mlog_errno(status); + goto bail; + } - for(i = 0; i < p_blocks; i++) { - brelse(bhs[i]); - bhs[i] = NULL; + brelse(bh); + bh = NULL; } v_blkno += p_blocks; } bail: - for(i = 0; i < CONCURRENT_JOURNAL_FILL; i++) - brelse(bhs[i]); return status; } _ Patches currently in -mm which might be from junxiao.bi@xxxxxxxxxx are ocfs2-improve-recovery-performance.patch -- To unsubscribe from this list: send the line "unsubscribe mm-commits" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html