On Wed, Aug 10, 2011 at 12:19:32AM +0800, Peter Zijlstra wrote:
> On Tue, 2011-08-09 at 18:16 +0200, Peter Zijlstra wrote:
> > On Tue, 2011-08-09 at 11:50 -0400, Vivek Goyal wrote:
> > >
> > > So IIUC, bdi->dirty_ratelimit is the dynamically adjusted desired rate
> > > limit (based on position ratio, dirty_bw and write_bw). But this seems
> > > to be overall bdi limit and does not seem to take into account the
> > > number of tasks doing IO to that bdi (as your comment suggests). So
> > > it probably will track write_bw as opposed to write_bw/N. What am
> > > I missing?
> >
> > I think the per task thing comes from him using the pages_dirtied
> > argument to balance_dirty_pages() to compute the sleep time. Although
> > I'm not quite sure how he keeps fairness in light of the sleep time
> > bounding to MAX_PAUSE.
>
> Furthermore, there's of course the issue that current->nr_dirtied is
> computed over all BDIs it dirtied pages from, and the sleep time is
> computed for the BDI it happened to do the overflowing write on.
>
> Assuming a task (mostly) writes to a single bdi, or equally to all, it
> should all work out.

Right. That's one pitfall I forgot to mention, sorry.

If _really_ necessary, the above imperfection can be avoided by adding
tsk->last_dirty_bdi and tsk->to_pause, and doing the following when
switching to another bdi:

        to_pause += nr_dirtied / task_ratelimit
        if (to_pause > reasonable_large_pause_time) {
                sleep(to_pause)
                to_pause = 0
        }
        nr_dirtied = 0

Thanks,
Fengguang
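
P.S. To make the bdi-switch accounting concrete, here is a rough
user-space sketch in C. It is only an illustration under assumptions,
not kernel code and not the actual patch: struct bdi and struct task are
simplified stand-ins, the per-bdi task_ratelimit field, the 0.2 s
threshold, fake_sleep() and task_dirty_pages() are all hypothetical
names, and "sleeping" is just accumulated in a counter so the behaviour
can be checked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a backing device; not the kernel struct. */
struct bdi {
    double task_ratelimit;  /* assumed per-task dirty rate, pages/second */
};

/* Hypothetical stand-in for the per-task fields discussed above. */
struct task {
    const struct bdi *last_dirty_bdi; /* bdi this task dirtied last */
    unsigned long nr_dirtied;         /* pages dirtied since last settle */
    double to_pause;                  /* pause owed so far, in seconds */
    double total_slept;               /* records sleeps instead of blocking */
};

/* Assumed threshold; the mail only says "reasonable_large_pause_time". */
#define REASONABLE_LARGE_PAUSE_TIME 0.2

/* Stand-in for a real sleep: just account for the time. */
static void fake_sleep(struct task *t, double seconds)
{
    t->total_slept += seconds;
}

/*
 * Account for `pages` newly dirtied on `bdi`. When the task switches to a
 * different bdi, the pages dirtied on the old one are first converted into
 * owed pause time at the old bdi's rate, so they are not later charged at
 * the new bdi's rate.
 */
static void task_dirty_pages(struct task *t, const struct bdi *bdi,
                             unsigned long pages)
{
    const struct bdi *old = t->last_dirty_bdi;

    if (old != NULL && old != bdi) {
        assert(old->task_ratelimit > 0.0);
        t->to_pause += t->nr_dirtied / old->task_ratelimit;
        if (t->to_pause > REASONABLE_LARGE_PAUSE_TIME) {
            fake_sleep(t, t->to_pause);
            t->to_pause = 0.0;
        }
        t->nr_dirtied = 0;
    }

    t->last_dirty_bdi = bdi;
    t->nr_dirtied += pages;
}
```

Small debts are carried in to_pause rather than slept immediately, so a
task that bounces between devices is not forced into many tiny sleeps
but still pays for what it dirtied on each bdi at that bdi's rate.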