Re: [PATCH] ext4: fix checking on nr_to_write

Ming Lei <ming.lei@xxxxxxxxxxxxx> · Tue, 15 Oct 2013 19:15:56 +0800

On Tue, 15 Oct 2013 12:39:00 +0200
Jan Kara <jack@xxxxxxx> wrote:

> On Tue 15-10-13 10:25:53, Ming Lei wrote:
> > Looks it makes sense, so how about below change?
> > 
> > --
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index 32c04ab..c32b599 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -2294,7 +2294,7 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
> >  {
> >  	struct address_space *mapping = mpd->inode->i_mapping;
> >  	struct pagevec pvec;
> > -	unsigned int nr_pages;
> > +	unsigned int nr_pages, nr_added = 0;
> >  	pgoff_t index = mpd->first_page;
> >  	pgoff_t end = mpd->last_page;
> >  	int tag;
> > @@ -2330,6 +2330,18 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
> >  			if (page->index > end)
> >  				goto out;
> >  
> > +			/*
> > +			 * Accumulated enough dirty pages? This doesn't apply
> > +			 * to WB_SYNC_ALL mode. For integrity sync we have to
> > +			 * keep going because someone may be concurrently
> > +			 * dirtying pages, and we might have synced a lot of
> > +			 * newly appeared dirty pages, but have not synced all
> > +			 * of the old dirty pages.
> > +			 */
> > +			if (mpd->wbc->sync_mode == WB_SYNC_NONE &&
> > +					nr_added >= mpd->wbc->nr_to_write)
> > +				goto out;
> > +
>   This won't quite work because if the page is fully mapped
> mpage_process_page_bufs() will immediately submit the page and decrease
> nr_to_write. So now you would end up writing less than you were asked for
> in some cases. 

Yes, you are right, so how about the patch below?

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 32c04ab..3cf7abb 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2295,6 +2295,7 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 	struct address_space *mapping = mpd->inode->i_mapping;
 	struct pagevec pvec;
 	unsigned int nr_pages;
+	int left = mpd->wbc->nr_to_write;
 	pgoff_t index = mpd->first_page;
 	pgoff_t end = mpd->last_page;
 	int tag;
@@ -2330,6 +2331,17 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 			if (page->index > end)
 				goto out;
 
+			/*
+			 * Accumulated enough dirty pages? This doesn't apply
+			 * to WB_SYNC_ALL mode. For integrity sync we have to
+			 * keep going because someone may be concurrently
+			 * dirtying pages, and we might have synced a lot of
+			 * newly appeared dirty pages, but have not synced all
+			 * of the old dirty pages.
+			 */
+			if (mpd->wbc->sync_mode == WB_SYNC_NONE && left <= 0)
+				goto out;
+
 			/* If we can't merge this page, we are done. */
 			if (mpd->map.m_len > 0 && mpd->next_page != page->index)
 				goto out;
@@ -2364,19 +2376,7 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 			if (err <= 0)
 				goto out;
 			err = 0;
-
-			/*
-			 * Accumulated enough dirty pages? This doesn't apply
-			 * to WB_SYNC_ALL mode. For integrity sync we have to
-			 * keep going because someone may be concurrently
-			 * dirtying pages, and we might have synced a lot of
-			 * newly appeared dirty pages, but have not synced all
-			 * of the old dirty pages.
-			 */
-			if (mpd->wbc->sync_mode == WB_SYNC_NONE &&
-			    mpd->next_page - mpd->first_page >=
-							mpd->wbc->nr_to_write)
-				goto out;
+			left--;
 		}
 		pagevec_release(&pvec);
 		cond_resched();
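
To illustrate why the check uses a local snapshot of nr_to_write rather than
the live value, here is a minimal user-space sketch. It is not kernel code:
struct fake_wbc and both scan functions are hypothetical stand-ins, and it
assumes the worst case described above, where every page is fully mapped and
is therefore submitted immediately, decrementing nr_to_write.

```c
/* Hypothetical stand-in for struct writeback_control (budget field only). */
struct fake_wbc {
	long nr_to_write;
};

/*
 * First proposal: compare a running count against the live budget.
 * Because submitting a page also decrements the budget, the two values
 * converge from both sides and the scan stops early.
 */
static int scan_live_budget(struct fake_wbc *wbc, int dirty_pages)
{
	int nr_added = 0;
	int written = 0;

	for (int i = 0; i < dirty_pages; i++) {
		if (nr_added >= wbc->nr_to_write)
			break;
		wbc->nr_to_write--;	/* page submitted immediately */
		written++;
		nr_added++;
	}
	return written;
}

/*
 * Second proposal: snapshot the budget once, then count down locally.
 * Later changes to wbc->nr_to_write no longer affect the stop condition.
 */
static int scan_snapshot_budget(struct fake_wbc *wbc, int dirty_pages)
{
	long left = wbc->nr_to_write;
	int written = 0;

	for (int i = 0; i < dirty_pages; i++) {
		if (left <= 0)
			break;
		wbc->nr_to_write--;	/* page submitted immediately */
		written++;
		left--;
	}
	return written;
}
```

With a budget of 10 and 100 dirty pages, scan_live_budget() stops after 5
pages, while scan_snapshot_budget() writes the full 10.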


> Attached patch should do what's needed. Can you try whether
> it fixes the problem for you (it seems to work OK in my testing).

In fact, I had written and tested your attached patch before my last post,
and it may trigger the BUG() in mpage_release_unused_pages(). That is because
we touch mpd->next_page without locking the current page, so it is better
not to increase mpd->next_page if the current page won't be processed.


Thanks,
-- 
Ming Lei