On Wed, Oct 05, 2011 at 03:52:06AM +0800, Vivek Goyal wrote:
> On Mon, Oct 03, 2011 at 09:42:28PM +0800, Wu Fengguang wrote:
> > Hi,
> >
> > This is the minimal IO-less balance_dirty_pages() changes that are expected to
> > be regression free (well, except for NFS).
> >
> >         git://github.com/fengguang/linux.git dirty-throttling-v12
> >
> > Tests results will be posted in a separate email.
>
> Looks like we are solving two problems.
>
> - IO less balance_dirty_pages()
> - Throttling based on ratelimit instead of based on number of dirty pages.
>
> The second piece is the one which has complicated calculations for
> calculating the global/bdi rates and logic for stablizing the rates etc.
>
> IIUC, second piece is primarily needed for better latencies for writers.

Well, yes. The bdi->dirty_ratelimit estimation turns out to be the most
confusing part of the patchset...

Other than the complexities, the algorithm does work pretty well in the
tests (except for small memory cases, in which case its estimation
accuracy no longer matters).

Note that the bdi->dirty_ratelimit thing, even when it goes wrong, is
very unlikely to cause large regressions. The known regressions mostly
originate from the nature of IO-less.

> Will it make sense to break down this work in two patch series. First
> push IO less balance dirty pages and then all the complicated pieces
> of ratelimits.
>
> ratelimit allowed you to come up with sleep time for the process. Without
> that I think you shall have to fall back to what Jan Kar had done,
> calculation based on number of pages.

If we drop all the smoothness considerations, the minimal
implementation would be close to this patch:

        [PATCH 05/35] writeback: IO-less balance_dirty_pages()
        http://www.spinics.net/lists/linux-mm/msg12880.html

However, experience showed that it may lead to much worse latencies
than the vanilla kernel in JBOD cases.
This is because the vanilla kernel has the option to break out of the
loop once enough pages have been written, whereas the IO-less
balance_dirty_pages() will just wait until the dirty pages drop below
the (rushed high) bdi threshold, which could take a long time.

Another question is that the IO-less balance_dirty_pages() basically
works as

        on every N pages dirtied, sleep for M jiffies

In the current patchset, we get the desired N with the formula

        N = bdi->dirty_ratelimit * desired_M

When dirty_ratelimit is not available, it would be a problem to
estimate an adequate N that works well for various workloads.

And to avoid regressions, patches 8,9,10,11 (maybe in updated form)
will still be necessary. A complete rerun of all the test cases would
also be needed, to find and fix any new regressions.

Overall it may cost too much (if possible at all, considering the two
problems listed above) to try out the above steps, whose main intention
is to find out "whether we can introduce the dirty_ratelimit
complexities later".

Considering that the complexity itself is not likely to cause problems
other than loss of smoothness, it looks beneficial to test the
ready-made code earlier in production environments, rather than to
spend a lot of effort stripping it out and testing new code, only to
add it back in some future release.

Thanks,
Fengguang
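
For illustration, the "on every N pages dirtied, sleep for M jiffies"
relation can be sketched in plain C as below. This is not code from the
patchset: the helper names and the hz parameter are hypothetical, and it
assumes the ratelimit is expressed in pages per second while the pause
is in jiffies, so the two are tied together by rate * time.

```c
#include <assert.h>

/*
 * Hypothetical helper, not kernel code: pages a task may dirty between
 * two sleeps. Assumes ratelimit_pps is in pages per second and the
 * pause is given in jiffies, with hz jiffies per second.
 */
static unsigned long pages_per_pause(unsigned long ratelimit_pps,
                                     unsigned long pause_jiffies,
                                     unsigned long hz)
{
        unsigned long n;

        assert(hz > 0);
        n = ratelimit_pps * pause_jiffies / hz;

        /* Never return 0: a task must be able to dirty at least one page. */
        return n ? n : 1;
}

/*
 * Inverse view (also hypothetical): how many jiffies to sleep after
 * pages_dirtied pages, so that the task stays at ratelimit_pps on average.
 */
static unsigned long pause_for_pages(unsigned long pages_dirtied,
                                     unsigned long ratelimit_pps,
                                     unsigned long hz)
{
        assert(ratelimit_pps > 0);
        return hz * pages_dirtied / ratelimit_pps;
}
```

For example, at 25600 pages/s (100 MiB/s with 4 KiB pages), a desired
pause of 10 jiffies and HZ=1000, pages_per_pause(25600, 10, 1000) gives
N = 256. Without a rate estimate there is no obvious input to feed such
a helper, which is the estimation problem described above.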