On Fri, Oct 16, 2015 at 8:57 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> OK so here is what I am playing with currently. It is not complete
> yet.

So this looks like it's going in a reasonable direction. However:

> +	if (__zone_watermark_ok(zone, order, min_wmark_pages(zone),
> +			ac->high_zoneidx, alloc_flags, target)) {
> +		/* Wait for some write requests to complete then retry */
> +		wait_iff_congested(zone, BLK_RW_ASYNC, HZ/50);
> +		goto retry;
> +	}

I still think we should at least spend some time re-thinking that
"wait_iff_congested()" thing. We may not actually be congested, but
might be unable to write anything out because of our allocation flags
(ie not allowed to recurse into the filesystems), so we might be in
the situation that we have a lot of dirty pages that we can't directly
do anything about.

Now, we will have woken kswapd, so something *will* hopefully be done
about them eventually, but at no time do we actually really wait for
it. We'll just busy-loop.

So at a minimum, I think we should yield to kswapd. We do do that
"cond_resched()" in wait_iff_congested(), but I'm not entirely
convinced that is at all enough to wait for kswapd to *do* something.

So before we really decide to see if we should oom, I think we should
have at least one forced io_schedule_timeout(), whether we're
congested or not.

And yes, as Tetsuo Handa said, any kind of short wait might be too
short for IO to really complete, but *something* will have completed.
Unless we're so far up the creek that we really should just oom.

But I suspect we'll have to just try things out and tweak it. This
patch looks like a reasonable starting point to me.

Tetsuo, mind trying it out and maybe tweaking it a bit for the load
you have? Does it seem to improve on your situation?

              Linus
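
To make the "at least one forced wait before we consider oom" idea
concrete, here is a rough userspace model of the control flow. It is
only a sketch, not kernel code: struct reclaim_model and the model_*()
functions are hypothetical stand-ins, and the real wait_iff_congested()
and io_schedule_timeout() are not called. The model assumes, as
described above, that the conditional wait only sleeps when the device
is congested and otherwise just yields.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct reclaim_model {
	bool congested;    /* stand-in for "the backing device is congested" */
	int forced_waits;  /* number of times the caller really slept */
	int yields;        /* number of times the caller merely yielded */
};

/*
 * Models the conditional wait: sleep only when congested, otherwise
 * just yield (the cond_resched() case). Returns true if it slept.
 */
static bool model_wait_iff_congested(struct reclaim_model *m)
{
	if (m->congested) {
		m->forced_waits++;
		return true;
	}
	m->yields++;
	return false;
}

/* Models an unconditional timed sleep. */
static void model_forced_wait(struct reclaim_model *m)
{
	m->forced_waits++;
}

/*
 * Retry max_retries times, then guarantee that at least one real
 * sleep has happened before the caller goes on to consider oom.
 */
static void model_retry_before_oom(struct reclaim_model *m, int max_retries)
{
	bool slept = false;
	int i;

	assert(max_retries >= 0);

	for (i = 0; i < max_retries; i++) {
		if (model_wait_iff_congested(m))
			slept = true;
	}

	/* Never congested means we only busy-looped: force one wait. */
	if (!slept)
		model_forced_wait(m);
}
```

In the uncongested case the retry loop only ever yields, so the final
forced wait is the only point where the caller actually gives kswapd
time to make progress. Whether a single short wait is long enough is
exactly the part that would need tuning against a real load.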