On Wed, 2009-09-09 at 16:23 +0200, Jan Kara wrote:
> Well, what I imagined we could do is:
> Have a per-bdi variable 'pages_written' - that would reflect the amount of
> pages written to the bdi since boot (OK, we'd have to handle overflows but
> that's doable).
>
> There will be a per-bdi variable 'pages_waited'. When a thread should sleep
> in balance_dirty_pages() because we are over limits, it kicks writeback thread
> and does:
>   to_wait = max(pages_waited, pages_written) + sync_dirty_pages() (or
>             whatever number we decide)
>   pages_waited = to_wait
>   sleep until pages_written reaches to_wait or we drop below dirty limits.
>
> That will make sure each thread will sleep until writeback threads have done
> their duty for the writing thread.
>
> If we make sure sleeping threads are properly ordered on the wait queue,
> we could always wakeup just the first one and thus avoid the herding
> effect. When we drop below dirty limits, we would just wakeup the whole
> waitqueue.
>
> Does this sound reasonable?

That seems to go wrong when there are multiple tasks waiting on the same
bdi: you'd count each page for 1/n of its weight.

Suppose pages_written = 1024, and 4 tasks block and each compute their
to_wait as pages_written + 256 = 1280. Then we'd release all 4 of them
after 256 pages are written, instead of after 4*256 = 1024 pages, which
would be pages_written = 2048.
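
To make the arithmetic in the example above concrete, here is a rough
userspace sketch, not kernel code. The function names are made up for
illustration. The first helper computes each task's target from
pages_written alone, as in the example. The second applies the quoted
update rule including pages_waited, and assumes the tasks update it one
at a time (the mail does not say how that would be serialized). Counter
overflow and the "drop below dirty limits" wakeup are left out.

```python
def targets_from_written_only(pages_written, n_tasks, chunk):
    """Each task computes to_wait = pages_written + chunk on its own,
    without looking at what other waiters have already asked for."""
    return [pages_written + chunk for _ in range(n_tasks)]


def targets_with_pages_waited(pages_written, n_tasks, chunk, pages_waited=0):
    """Quoted rule: to_wait = max(pages_waited, pages_written) + chunk,
    then pages_waited = to_wait. Assumes tasks arrive one after another."""
    targets = []
    for _ in range(n_tasks):
        to_wait = max(pages_waited, pages_written) + chunk
        pages_waited = to_wait
        targets.append(to_wait)
    return targets


def pages_until_all_released(pages_written, targets):
    """Pages that still have to be written before the last waiter's
    target is reached."""
    return max(targets) - pages_written


# Numbers from the example: pages_written = 1024, 4 tasks, 256 pages each.
naive = targets_from_written_only(1024, 4, 256)
chained = targets_with_pages_waited(1024, 4, 256)

print(naive, pages_until_all_released(1024, naive))
print(chained, pages_until_all_released(1024, chained))
```

With targets taken from pages_written alone, all four tasks get 1280 and
are released after 256 pages. With pages_waited carried forward between
tasks, the targets are 1280, 1536, 1792 and 2048, so the last task is
released at pages_written = 2048.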