Hello, On Mon 14-10-13 00:39:52, Ming Lei wrote: > Commit 4e7ea81db5(ext4: restructure writeback path) introduces > another performance regression on random write: > > - one more page may be mapped to ext4 extent in mpage_prepare_extent_to_map, > and will be submitted for I/O so nr_to_write will become -1 before 'done' > is set > > - the worse thing is that dirty pages may still be retrieved from page > cache after nr_to_write becomes negative, so lots of small chunks can be > submitted to block device when page writeback is catching up with write path, > and performance is hurted. Umm, I guess I see what are you pointing at. Thanks for catching that. mpage_process_page_bufs() always adds a buffer to mpd even if nr_to_write is already <= 0. But I would somewhat prefer not to call mpage_prepare_extent_to_map() at all when nr_to_write <= 0. So a patch like: ret = mpage_prepare_extent_to_map(&mpd); if (!ret) { - if (mpd.map.m_len) + if (mpd.map.m_len) { ret = mpage_map_and_submit_extent(handle, &mpd, &give_up_on_write); - else { + done = (wbc->nr_to_write <= 0); + } else { Should also fix your problem, am I right? Honza > On one arm A15 board(arndale) with sata 3.0 SSD(CPU: 1.5GHz dura core, RAM: 2GB), > this patch can improve below test result from 157MB/sec to 174MB/sec(>10%): > > dd if=/dev/zero of=./z.img bs=8K count=512K > > The above test is actually prototype of block write in bonnie++ utility. > > This patch fixes check on nr_to_write in mpage_prepare_extent_to_map() > to make sure nr_to_write won't become negative. > > Cc: Ted Tso <tytso@xxxxxxx> > Cc: Jan Kara <jack@xxxxxxx> > Cc: linux-ext4@xxxxxxxxxxxxxxx > Cc: "linux-fsdevel@xxxxxxxxxxxxxxx" <linux-fsdevel@xxxxxxxxxxxxxxx> > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxxxxx> > --- > fs/ext4/inode.c | 20 ++++++++++---------- > 1 file changed, 10 insertions(+), 10 deletions(-) > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index 32c04ab..6a62803 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -2356,15 +2356,6 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd) > if (mpd->map.m_len == 0) > mpd->first_page = page->index; > mpd->next_page = page->index + 1; > - /* Add all dirty buffers to mpd */ > - lblk = ((ext4_lblk_t)page->index) << > - (PAGE_CACHE_SHIFT - blkbits); > - head = page_buffers(page); > - err = mpage_process_page_bufs(mpd, head, head, lblk); > - if (err <= 0) > - goto out; > - err = 0; > - > /* > * Accumulated enough dirty pages? This doesn't apply > * to WB_SYNC_ALL mode. For integrity sync we have to > @@ -2374,9 +2365,18 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd) > * of the old dirty pages. > */ > if (mpd->wbc->sync_mode == WB_SYNC_NONE && > - mpd->next_page - mpd->first_page >= > + mpd->next_page - mpd->first_page > > mpd->wbc->nr_to_write) > goto out; > + > + /* Add all dirty buffers to mpd */ > + lblk = ((ext4_lblk_t)page->index) << > + (PAGE_CACHE_SHIFT - blkbits); > + head = page_buffers(page); > + err = mpage_process_page_bufs(mpd, head, head, lblk); > + if (err <= 0) > + goto out; > + err = 0; > } > pagevec_release(&pvec); > cond_resched(); > -- > 1.7.9.5 > -- Jan Kara <jack@xxxxxxx> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html