On Wed, 10 Sep 2014 15:48:43 +0200 Michal Hocko <mhocko@xxxxxxx> wrote:

> On Tue 09-09-14 12:33:46, Neil Brown wrote:
> > On Thu, 4 Sep 2014 15:54:27 +0200 Michal Hocko <mhocko@xxxxxxx> wrote:
> >
> > > [Sorry for jumping in so late - I've been busy last days]
> > >
> > > On Wed 27-08-14 16:36:44, Mel Gorman wrote:
> > > > On Tue, Aug 26, 2014 at 08:00:20PM -0400, Trond Myklebust wrote:
> > > > > On Tue, Aug 26, 2014 at 7:51 PM, Trond Myklebust
> > > > > <trond.myklebust@xxxxxxxxxxxxxxx> wrote:
> > > > > > On Tue, Aug 26, 2014 at 7:19 PM, Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> > > [...]
> > > > > >> wait_on_page_writeback() is a hammer, and we need to be better about
> > > > > >> this once we have per-memcg dirty writeback and throttling, but I
> > > > > >> think that really misses the point. Even if memcg writeback waiting
> > > > > >> were smarter, any length of time spent waiting for yourself to make
> > > > > >> progress is absurd. We just shouldn't be solving deadlock scenarios
> > > > > >> through arbitrary timeouts on one side. If you can't wait for IO to
> > > > > >> finish, you shouldn't be passing __GFP_IO.
> > >
> > > Exactly!
> >
> > This is overly simplistic.
> > The code that cannot wait may be further up the call chain and not in a
> > position to avoid passing __GFP_IO.
> > In many case it isn't that "you can't wait for IO" in general, but that you
> > cannot wait for one specific IO request.
>
> Could you be more specific, please? Why would a particular IO make any
> difference to general IO from the same path? My understanding was that
> once the page is marked PG_writeback then it is about to be written to
> its destination and if there is any need for memory allocation it should
> better not allow IO from reclaim.

The more complex the filesystem, the harder it is to "not allow IO from reclaim".
For NFS (which started this thread) there might be a need to open a new
connection, so every allocation in the networking code would need to be
careful. And it isn't impossible that a 'gss' credential needs to be
re-negotiated, and that might even need user-space interaction (not sure of
details).

What you say certainly used to be the case, and very often still is. But it
doesn't really scale with the complexity of filesystems.

I don't think there is (yet) any need to optimise for allocations that don't
disallow IO happening in the writeout path. But I do think waiting
indefinitely for a particular IO is unjustifiable.

>
> > wait_on_page_writeback() waits for a specific IO and so is dangerous.
> > congestion_wait() or similar waits for IO in general and so is much safer.
>
> congestion_wait was actually not sufficient to prevent from OOM with
> heavy writer in a small memcg. We simply do not know how long will the
> IO last so any "wait for a random timeout" will end up causing some
> troubles.

I certainly accept that "congestion_wait" isn't a sufficient solution.
The thing I like about it is that it combines a timeout with a measure of
activity.
As long as writebacks are completing, it is reasonable to
wait_on_page_writeback(). But if no writebacks have completed for a while,
then it seems pointless waiting on this page any more. Best to try to make
forward progress with whatever memory you can find.

NeilBrown
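
The "timeout combined with a measure of activity" idea in the last paragraph
can be illustrated with a small userspace sketch. This is not kernel code and
not a proposed patch: struct wb_state, tick_fn, wait_while_progress() and the
two tick helpers are hypothetical names invented for this illustration, and
"time" is just a loop counter. The wait continues while any writeback is
completing, and gives up on the page once stall_limit consecutive ticks pass
with no completion at all.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for writeback state; not a kernel structure. */
struct wb_state {
    unsigned long completed;    /* writebacks completed so far, any page */
    bool page_under_writeback;  /* is the page we care about still in flight? */
};

/* Hypothetical hook: advance simulated time by one tick. */
typedef void (*tick_fn)(struct wb_state *wb, unsigned long tick);

/*
 * Wait for one page, but only while writeback in general makes progress.
 * Returns true if the page finished, false if stall_limit consecutive
 * ticks went by without a single completion.
 */
bool wait_while_progress(struct wb_state *wb, tick_fn tick,
                         unsigned long stall_limit)
{
    unsigned long last_seen = wb->completed;
    unsigned long stalled = 0;
    unsigned long now = 0;

    assert(stall_limit > 0);

    while (wb->page_under_writeback) {
        tick(wb, now++);
        if (wb->completed != last_seen) {
            /* Something completed: reset the stall counter, keep waiting. */
            last_seen = wb->completed;
            stalled = 0;
        } else if (++stalled >= stall_limit) {
            /* No activity for a while: stop waiting on this page. */
            return false;
        }
    }
    return true;
}

/* Demo: one completion per tick; our page finishes on the fifth tick. */
void tick_steady(struct wb_state *wb, unsigned long tick)
{
    wb->completed++;
    if (tick >= 4)
        wb->page_under_writeback = false;
}

/* Demo: nothing ever completes, as in a writeback that depends on us. */
void tick_stuck(struct wb_state *wb, unsigned long tick)
{
    (void)wb;
    (void)tick;
}
```

With tick_steady the call returns true once the page completes; with
tick_stuck it returns false after stall_limit ticks, at which point the caller
would move on and try to reclaim something else. Note that the sketch keeps
waiting for as long as other writebacks complete, which matches the paragraph
above but is itself a design choice.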