On Thu, Sep 29, 2011 at 04:49:57PM +0800, Peter Zijlstra wrote:
> On Thu, 2011-09-29 at 11:32 +0800, Wu Fengguang wrote:
> > > Now I guess the only problem is when nr_bdi * MIN_WRITEBACK_PAGES ~
> > > limit, at which point things go pear shaped.
> >
> > Yes. In that case the global @dirty will always be drove up to @limit.
> > Once @dirty dropped reasonably below, whichever bdi task wakeup first
> > will take the chance to fill the gap, which is not fair for bdi's of
> > different speed.
> >
> > Let me retry the thresh=1M,10M test cases without MIN_WRITEBACK_PAGES.
> > Hopefully the removal of it won't impact performance a lot.
>
> Right, so alternatively we could try an argument that this is
> sufficiently rare and shouldn't happen. People with lots of disks tend
> to also have lots of memory, etc.

Right.

> If we do find it happens we can always look at it again.

Sure. I now have the results for the single-disk thresh=1M,8M,100M
cases and find no big differences when MIN_WRITEBACK_PAGES is removed:

    3.1.0-rc4-bgthresh3+      3.1.0-rc4-bgthresh4+
------------------------  ------------------------
                 3988742        +1.9%      4063217  thresh=100M/ext4-10dd-4k-8p-4096M-100M:10-X
                 4758884        +1.5%      4829320  thresh=100M/ext4-1dd-4k-8p-4096M-100M:10-X
                 4621240        +1.6%      4693525  thresh=100M/ext4-2dd-4k-8p-4096M-100M:10-X
                 3420717        +0.1%      3423712  thresh=100M/xfs-10dd-4k-8p-4096M-100M:10-X
                 4361830        +1.4%      4423554  thresh=100M/xfs-1dd-4k-8p-4096M-100M:10-X
                 3964043        +0.2%      3972057  thresh=100M/xfs-2dd-4k-8p-4096M-100M:10-X
                 2937926        +0.6%      2956870  thresh=1M/ext4-10dd-4k-8p-4096M-1M:10-X
                 4472552        -1.9%      4387457  thresh=1M/ext4-1dd-4k-8p-4096M-1M:10-X
                 4085707        -3.0%      3961155  thresh=1M/ext4-2dd-4k-8p-4096M-1M:10-X
                 2206897        +2.1%      2253839  thresh=1M/xfs-10dd-4k-8p-4096M-1M:10-X
                 4207336        -2.1%      4119821  thresh=1M/xfs-1dd-4k-8p-4096M-1M:10-X
                 3739888        -3.6%      3604315  thresh=1M/xfs-2dd-4k-8p-4096M-1M:10-X
                 3279302        -0.2%      3273310  thresh=8M/ext4-10dd-4k-8p-4096M-8M:10-X
                 4834878        +1.6%      4912372  thresh=8M/ext4-1dd-4k-8p-4096M-8M:10-X
                 4511120        -1.7%      4435193  thresh=8M/ext4-2dd-4k-8p-4096M-8M:10-X
                 2443874        -0.5%      2432188  thresh=8M/xfs-10dd-4k-8p-4096M-8M:10-X
                 4308416        -0.6%      4283110  thresh=8M/xfs-1dd-4k-8p-4096M-8M:10-X
                 3739810        +0.6%      3763320  thresh=8M/xfs-2dd-4k-8p-4096M-8M:10-X

Or when lowering the largest promotion ratio from 128 to 8:

    3.1.0-rc4-bgthresh4+      3.1.0-rc4-bgthresh5+
------------------------  ------------------------
                 4063217        -0.0%      4062022  thresh=100M/ext4-10dd-4k-8p-4096M-100M:10-X
                 4829320        +1.1%      4882829  thresh=100M/ext4-1dd-4k-8p-4096M-100M:10-X
                 4693525        +0.1%      4700537  thresh=100M/ext4-2dd-4k-8p-4096M-100M:10-X
                 3423712        +0.2%      3431603  thresh=100M/xfs-10dd-4k-8p-4096M-100M:10-X
                 4423554        -0.3%      4408912  thresh=100M/xfs-1dd-4k-8p-4096M-100M:10-X
                 3972057        -0.1%      3968535  thresh=100M/xfs-2dd-4k-8p-4096M-100M:10-X
                 2956870        -0.9%      2929605  thresh=1M/ext4-10dd-4k-8p-4096M-1M:10-X
                 4387457        -0.2%      4378233  thresh=1M/ext4-1dd-4k-8p-4096M-1M:10-X
                 3961155        -0.5%      3940075  thresh=1M/ext4-2dd-4k-8p-4096M-1M:10-X
                 2253839        -0.9%      2232976  thresh=1M/xfs-10dd-4k-8p-4096M-1M:10-X
                 4119821        -2.1%      4031983  thresh=1M/xfs-1dd-4k-8p-4096M-1M:10-X
                 3604315        -3.1%      3493042  thresh=1M/xfs-2dd-4k-8p-4096M-1M:10-X
                 3273310        -1.1%      3237060  thresh=8M/ext4-10dd-4k-8p-4096M-8M:10-X
                 4912372        -0.0%      4911287  thresh=8M/ext4-1dd-4k-8p-4096M-8M:10-X
                 4435193        +0.1%      4441581  thresh=8M/ext4-2dd-4k-8p-4096M-8M:10-X
                 2432188        +1.1%      2459249  thresh=8M/xfs-10dd-4k-8p-4096M-8M:10-X
                 4283110        +0.1%      4289456  thresh=8M/xfs-1dd-4k-8p-4096M-8M:10-X
                 3763320        -0.1%      3758938  thresh=8M/xfs-2dd-4k-8p-4096M-8M:10-X

As for the thresh=100M JBOD cases, I don't see many occurrences of
promotion ratio > 2, so the simplification should make no difference
there either.

Thus the finalized code will be:

+	x_intercept = bdi_thresh / 2;
+	if (bdi_dirty < x_intercept) {
+		if (bdi_dirty > x_intercept / 8) {
+			pos_ratio *= x_intercept;
+			do_div(pos_ratio, bdi_dirty);
+		} else
+			pos_ratio *= 8;
+	}

Thanks,
Fengguang
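
For illustration only, the promotion logic in the finalized code can
be tried out in userspace with a minimal sketch like the one below.
The function name promote_pos_ratio() and the uint64_t types are
assumptions made for this example and are not part of the patch. The
kernel's do_div() is replaced by a plain 64-bit division, and the
sample values in the usage note are made up rather than taken from
the test runs.

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the snippet above: boost
 * pos_ratio for a bdi whose dirty page count is below half of its
 * threshold, capping the boost at 8x.
 */
uint64_t promote_pos_ratio(uint64_t pos_ratio, uint64_t bdi_dirty,
			   uint64_t bdi_thresh)
{
	uint64_t x_intercept = bdi_thresh / 2;

	if (bdi_dirty < x_intercept) {
		if (bdi_dirty > x_intercept / 8) {
			/* Scale by x_intercept / bdi_dirty, between 1x and 8x. */
			pos_ratio *= x_intercept;
			pos_ratio /= bdi_dirty;	/* do_div() in the kernel */
		} else {
			/* Very few dirty pages: apply the 8x cap. */
			pos_ratio *= 8;
		}
	}
	return pos_ratio;
}
```

With a made-up bdi_thresh of 2048 (so x_intercept is 1024) and an
input pos_ratio of 1024, bdi_dirty=512 doubles the ratio to 2048,
bdi_dirty=128 hits the cap and gives 8192, and any bdi_dirty at or
above 1024 leaves the ratio unchanged. The division cannot be by
zero, since that branch is only taken when bdi_dirty > x_intercept / 8.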