On Tue, 26 Aug 2014 19:19:38 -0400 Johannes Weiner <hannes@xxxxxxxxxxx>
wrote:

> On Tue, Aug 26, 2014 at 02:26:24PM +0100, Mel Gorman wrote:
> > It'd be nice if the memcg people could comment on whether they plan to
> > handle the fact that memcg is the only caller of wait_on_page_writeback
> > in direct reclaim paths.
>
> wait_on_page_writeback() is a hammer, and we need to be better about
> this once we have per-memcg dirty writeback and throttling, but I
> think that really misses the point. Even if memcg writeback waiting
> were smarter, any length of time spent waiting for yourself to make
> progress is absurd. We just shouldn't be solving deadlock scenarios
> through arbitrary timeouts on one side. If you can't wait for IO to
> finish, you shouldn't be passing __GFP_IO.

I think that is overly simplistic.
Certainly "waiting for yourself" is absurd, but it can be hard to know
if that is what you are doing.
Refusing to wait at all just because you might be waiting for yourself
is also absurd.

Direct reclaim already has "congestion_wait()" calls which wait a little
while, just in case. Doing that when you find a page in writeback might
not be such a bad thing. When this becomes an issue, writeout is already
slowing everything down, and maybe slowing down a bit more isn't much
cost.

>
> Can't you use mempools like the other IO paths?

mempools and other pre-allocation strategies are appropriate for block
devices and critical for any "swap out" path.
Filesystems have traditionally got by without them, using GFP_NOFS when
necessary.

GFP_NOFS was originally meant to be set when holding filesystem-internal
locks. Setting it everywhere that memory might be allocated while
handling write-out is a very different use-case.
Setting GFP_NOFS in more and more places doesn't really scale very well
and is particularly awkward for NFS, as lots of network interfaces don't
allow setting GFP flags, and the network maintainers really don't want
them to.

The recent direct-reclaim changes to get kswapd and the flush- threads
to do most of the work made it much easier to avoid deadlocks. Direct
reclaim no longer calls ->writepage and doesn't
wait_on_page_writeback(). Except when handling memory pressure for a
memcg.

It's not an easy problem, but I don't think that "use mempools" is a
valid answer. A simple rule like "direct reclaim never blocks
indefinitely" is, I think, quite achievable and would resolve a whole
class of deadlocks.

Thanks,
NeilBrown
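
To make the "never blocks indefinitely" rule concrete, here is a toy
user-space sketch in C. It is not kernel code: struct toy_page,
toy_page_writeback_pending() and toy_wait_on_writeback_bounded() are
hypothetical names invented for illustration, and simulated ticks stand
in for real I/O time. The only point is that the waiter gives up after a
fixed budget instead of blocking until writeback completes.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a page under writeback (not struct page). */
struct toy_page {
    /* Simulated I/O time left; a negative value means "never completes",
     * e.g. because the waiter itself is needed to finish the write. */
    int ticks_until_clean;
};

/* Advance simulated time by one tick; report whether writeback is
 * still pending afterwards. */
bool toy_page_writeback_pending(struct toy_page *p)
{
    if (p->ticks_until_clean > 0)
        p->ticks_until_clean--;
    return p->ticks_until_clean != 0;
}

/* Bounded wait: poll for at most max_ticks, then give up.
 * Returns true if the page became clean within the budget, false if
 * the caller should stop waiting and do something else. */
bool toy_wait_on_writeback_bounded(struct toy_page *p, int max_ticks)
{
    for (int waited = 0; waited < max_ticks; waited++) {
        if (!toy_page_writeback_pending(p))
            return true;
    }
    return false;
}
```

A caller that gets false back would skip the page or return to its own
caller rather than sleep further, in the same spirit as the short
congestion_wait() pauses mentioned above.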