Re: [patch 3/8] per backing_dev dirty and writeback page accounting

Miklos Szeredi <miklos@xxxxxxxxxx> · Wed, 14 Mar 2007 23:09:06 +0100

> Only if the queue depth is not bound. Queue depths are bound and so
> the distance we can go over the threshold is limited.  This is the
> fundamental principle on which the throttling is based.....
> 
> Hence, if the queue is not full, then we will have either written
> dirty pages to it (i.e wbc->nr_write != write_chunk so we will throttle
> or continue normally if write_chunk was written) or we have no more
> dirty pages left.
> 
> Having no dirty pages left on the bdi and it not being congested
> means we effectively have a clean, idle bdi. We should not be trying
> to throttle writeback here - we can't do anything to improve the
> situation by continuing to try to do writeback on this bdi, so we
> may as well give up and let the writer continue. Once we have dirty
> pages on the bdi, we'll get throttled appropriately.

OK, you convinced me.

How about this patch?  I introduced a new wbc counter, that sums the
number of dirty pages encountered, including ones already under
writeback.

Dave, big thanks for your insights.

Miklos

Index: linux/include/linux/writeback.h
===================================================================

--- linux.orig/include/linux/writeback.h	2007-03-14 22:43:42.000000000 +0100
+++ linux/include/linux/writeback.h	2007-03-14 22:58:56.000000000 +0100
@@ -44,6 +44,7 @@ struct writeback_control {
 	long nr_to_write;		/* Write this many pages, and decrement
 					   this for each page written */
 	long pages_skipped;		/* Pages which were not written */
+	long nr_dirty;			/* Number of dirty pages encountered */
 
 	/*
 	 * For a_ops->writepages(): is start or end are non-zero then this is
Index: linux/mm/page-writeback.c
===================================================================
--- linux.orig/mm/page-writeback.c	2007-03-14 22:41:01.000000000 +0100
+++ linux/mm/page-writeback.c	2007-03-14 23:00:20.000000000 +0100
@@ -220,6 +220,17 @@ static void balance_dirty_pages(struct a
 			pages_written += write_chunk - wbc.nr_to_write;
 			if (pages_written >= write_chunk)
 				break;		/* We've done our duty */
+
+			/*
+			 * If just a few dirty pages were encountered, and
+			 * the queue is not congested, then allow this dirty
+			 * producer to continue.  This resolves the deadlock
+			 * that happens when one filesystem writes back data
+			 * through another.  It should also help when a slow
+			 * device is completely blocking other writes.
+			 */
+			if (wbc.nr_dirty < 8 && !bdi_write_congested(bdi))
+				break;
 		}
 		congestion_wait(WRITE, HZ/10);
 	}
@@ -612,6 +623,7 @@ retry:
 					      min(end - index, (pgoff_t)PAGEVEC_SIZE-1) + 1))) {
 		unsigned i;
 
+		wbc->nr_dirty += nr_pages;
 		scanned = 1;
 		for (i = 0; i < nr_pages; i++) {
 			struct page *page = pvec.pages[i];
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html