[PATCH 4/5] writeback: permit through good bdi even when global dirty exceeded

Wu Fengguang <fengguang.wu@xxxxxxxxx> · Mon, 5 Dec 2011 14:22:17 +0800

On a system with 1 local mount and 1 NFS mount, if the NFS server
becomes not responding when dd to the NFS mount, the NFS dirty pages may
exceed the global dirty limit and _every_ task involving writing will be
blocked. The whole system appears unresponsive.

The workaround is to permit through the bdi's that only has a small
number of dirty pages. The number chosen (bdi_stat_error pages) is not
enough to enable the local disk to run in optimal throughput, however is
enough to make the system responsive on a broken NFS mount. The user can
then kill the dirtiers on the NFS mount and increase the global dirty
limit to bring up the local disk's throughput.

It risks allowing dirty pages to grow much larger than the global dirty
limit when there are 1000+ mounts, however that's very unlikely to happen,
especially in low memory profiles.

Signed-off-by: Wu Fengguang <fengguang.wu@xxxxxxxxx>
---
 mm/page-writeback.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 155efca..17403e3 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1148,6 +1148,19 @@ pause:
 		if (task_ratelimit)
 			break;
 
+		/*
+		 * In the case of an unresponding NFS server and the NFS dirty
+		 * pages exceeds dirty_thresh, give the other good bdi's a pipe
+		 * to go through, so that tasks on them still remain responsive.
+		 *
+		 * In theory 1 page is enough to keep the comsumer-producer
+		 * pipe going: the flusher cleans 1 page => the task dirties 1
+		 * more page. However bdi_dirty has accounting errors.  So use
+		 * the larger and more IO friendly bdi_stat_error.
+		 */
+		if (bdi_dirty <= bdi_stat_error(bdi))
+			break;
+
 		if (fatal_signal_pending(current))
 			break;
 	}
-- 
1.7.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html