[PATCH 16/35] writeback: increase min pause time on concurrent dirtiers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Target for >60ms pause time when there are 100+ heavy dirtiers per bdi.
(will average around 100ms given 200ms max pause time)

It's OK for 1 dd task doing 100MB/s to be throttle paused 100 times per
second.  However when there are 100 tasks writing to the same disk,
That sums up to 100*100 balance_dirty_pages() calls per second and may
lead to massive cacheline bouncing on accessing the global page states
in NUMA machines.  Even in single socket boxes, we easily see >10% CPU
time reduction by increasing the pause time.

CC: Dave Chinner <david@xxxxxxxxxxxxx>
Signed-off-by: Wu Fengguang <fengguang.wu@xxxxxxxxx>
---
 mm/page-writeback.c |   23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

--- linux-next.orig/mm/page-writeback.c	2010-12-13 21:46:16.000000000 +0800
+++ linux-next/mm/page-writeback.c	2010-12-13 21:46:16.000000000 +0800
@@ -659,6 +659,27 @@ static unsigned long max_pause(unsigned 
 }
 
 /*
+ * Scale up pause time for concurrent dirtiers in order to reduce CPU overheads.
+ * But ensure reasonably large [min_pause, max_pause] range size, so that
+ * nr_dirtied_pause (and hence future pause time) can stay reasonably stable.
+ */
+static unsigned long min_pause(struct backing_dev_info *bdi,
+			       unsigned long max)
+{
+	unsigned long hi = ilog2(bdi->write_bandwidth);
+	unsigned long lo = ilog2(bdi->throttle_bandwidth);
+	unsigned long t;
+
+	if (lo >= hi)
+		return 1;
+
+	/* (N * 10ms) on 2^N concurrent tasks */
+	t = (hi - lo) * (10 * HZ) / 1024;
+
+	return clamp_val(t, 1, max / 2);
+}
+
+/*
  * balance_dirty_pages() must be called by processes which are generating dirty
  * data.  It looks at the number of dirty pages in the machine and will force
  * the caller to perform writeback if the system is over `vm_dirty_ratio'.
@@ -798,7 +819,7 @@ pause:
 
 	if (pause == 0 && nr_dirty < background_thresh)
 		current->nr_dirtied_pause = ratelimit_pages(bdi);
-	else if (pause == 1)
+	else if (pause <= min_pause(bdi, pause_max))
 		current->nr_dirtied_pause += current->nr_dirtied_pause / 32 + 1;
 	else if (pause >= pause_max)
 		/*


--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux