On Tue, 15 Oct 2013 12:39:00 +0200 Jan Kara <jack@xxxxxxx> wrote:

> On Tue 15-10-13 10:25:53, Ming Lei wrote:
> > Looks it makes sense, so how about below change?
> >
> > --
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index 32c04ab..c32b599 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -2294,7 +2294,7 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
> >  {
> >  	struct address_space *mapping = mpd->inode->i_mapping;
> >  	struct pagevec pvec;
> > -	unsigned int nr_pages;
> > +	unsigned int nr_pages, nr_added = 0;
> >  	pgoff_t index = mpd->first_page;
> >  	pgoff_t end = mpd->last_page;
> >  	int tag;
> > @@ -2330,6 +2330,18 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
> >  			if (page->index > end)
> >  				goto out;
> >
> > +			/*
> > +			 * Accumulated enough dirty pages? This doesn't apply
> > +			 * to WB_SYNC_ALL mode. For integrity sync we have to
> > +			 * keep going because someone may be concurrently
> > +			 * dirtying pages, and we might have synced a lot of
> > +			 * newly appeared dirty pages, but have not synced all
> > +			 * of the old dirty pages.
> > +			 */
> > +			if (mpd->wbc->sync_mode == WB_SYNC_NONE &&
> > +			    nr_added >= mpd->wbc->nr_to_write)
> > +				goto out;
> > +
>   This won't quite work because if the page is fully mapped
> mpage_process_page_bufs() will immediately submit the page and decrease
> nr_to_write. So now you would end up writing less than you were asked for
> in some cases.

Yes, you are right, so how about below?

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 32c04ab..3cf7abb 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2295,6 +2295,7 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 	struct address_space *mapping = mpd->inode->i_mapping;
 	struct pagevec pvec;
 	unsigned int nr_pages;
+	int left = mpd->wbc->nr_to_write;
 	pgoff_t index = mpd->first_page;
 	pgoff_t end = mpd->last_page;
 	int tag;
@@ -2330,6 +2331,17 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 			if (page->index > end)
 				goto out;
 
+			/*
+			 * Accumulated enough dirty pages? This doesn't apply
+			 * to WB_SYNC_ALL mode. For integrity sync we have to
+			 * keep going because someone may be concurrently
+			 * dirtying pages, and we might have synced a lot of
+			 * newly appeared dirty pages, but have not synced all
+			 * of the old dirty pages.
+			 */
+			if (mpd->wbc->sync_mode == WB_SYNC_NONE && left <= 0)
+				goto out;
+
 			/* If we can't merge this page, we are done. */
 			if (mpd->map.m_len > 0 && mpd->next_page != page->index)
 				goto out;
@@ -2364,19 +2376,7 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 			if (err <= 0)
 				goto out;
 			err = 0;
-
-			/*
-			 * Accumulated enough dirty pages? This doesn't apply
-			 * to WB_SYNC_ALL mode. For integrity sync we have to
-			 * keep going because someone may be concurrently
-			 * dirtying pages, and we might have synced a lot of
-			 * newly appeared dirty pages, but have not synced all
-			 * of the old dirty pages.
-			 */
-			if (mpd->wbc->sync_mode == WB_SYNC_NONE &&
-			    mpd->next_page - mpd->first_page >=
-							mpd->wbc->nr_to_write)
-				goto out;
+			left--;
 		}
 		pagevec_release(&pvec);
 		cond_resched();

> Attached patch should do what's needed. Can you try whether
> it fixes the problem for you (it seems to work OK in my testing).

In fact, I had written and tested your attached patch before my last
post, and it may trigger BUG() in mpage_release_unused_pages(). That is
because we touch mpd->next_page without locking the current page, so it
is better not to increase mpd->next_page if the current page won't be
processed.

Thanks,
--
Ming Lei
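
For illustration only, here is a userspace toy model of the budget
accounting discussed in this thread. It is a sketch, not kernel code:
struct toy_wbc and both scan functions are hypothetical names, and it
assumes the worst case where every page is fully mapped and submitted
immediately, so the budget is decremented once per page. Under that
assumption, comparing a page counter against the live nr_to_write stops
early, while a snapshot taken at entry does not.

```c
/* Hypothetical stand-in for the writeback budget; not the kernel struct. */
struct toy_wbc {
	long nr_to_write;
};

/*
 * Models the first proposal: the number of pages handled so far is
 * compared against the live budget, which shrinks on every submission.
 */
static int scan_live_budget(struct toy_wbc *wbc, int dirty_pages)
{
	int nr_added = 0;
	int i;

	for (i = 0; i < dirty_pages; i++) {
		if (nr_added >= wbc->nr_to_write)
			break;
		/* Assumed: page is fully mapped, so it is submitted now. */
		wbc->nr_to_write--;
		nr_added++;
	}
	return nr_added;
}

/*
 * Models the second proposal: the budget is snapshotted once into
 * 'left', so later decrements of nr_to_write do not affect the check.
 */
static int scan_snapshot_budget(struct toy_wbc *wbc, int dirty_pages)
{
	int left = (int)wbc->nr_to_write;
	int written = 0;
	int i;

	for (i = 0; i < dirty_pages; i++) {
		if (left <= 0)
			break;
		/* Same assumption: immediate submission per page. */
		wbc->nr_to_write--;
		written++;
		left--;
	}
	return written;
}
```

With a budget of 8 and 20 dirty pages, the first function handles only
4 pages in this model, while the second handles all 8.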