On Mon, Jul 22, 2013 at 03:47:03PM -0400, Rik van Riel wrote:
> On 07/19/2013 04:55 PM, Johannes Weiner wrote:
> >When the page allocator fails to get a page from all zones in its
> >given zonelist, it wakes up the per-node kswapds for all zones that
> >are at their low watermark.
> >
> >However, with a system under load and the free page counters being
> >per-cpu approximations, the observed counter value in a zone can
> >fluctuate enough that the allocation fails but the kswapd wakeup is
> >also skipped while the zone is still really close to the low
> >watermark.
> >
> >When one node misses a wakeup like this, it won't be aged before all
> >the other node's zones are down to their low watermarks again. And
> >skipping a full aging cycle is an obvious fairness problem.
> >
> >Kswapd runs until the high watermarks are restored, so it should also
> >be woken when the high watermarks are not met. This ages nodes more
> >equally and creates a safety margin for the page counter fluctuation.
> >
> >By using zone_balanced(), it will now check, in addition to the
> >watermark, if compaction requires more order-0 pages to create a
> >higher order page.
> >
> >Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
>
> This patch alone looks like it could have the effect of increasing
> the pressure on the first zone in the zonelist, keeping its free
> memory above the low watermark essentially forever, without having
> the allocator fall back to other zones.
>
> However, your third patch fixes that problem, and missed wakeups
> would still hurt, so...

The kswapd wakeups happen in the slowpath, after the fastpath tried
all zones in the zonelist, not just the first one.

With the problem fixed in #3, the slowpath is rarely entered (even
when kswapds should be woken). From that point of view, the effects
of #1 are further improved by #3, but #1 on its own does not worsen
the situation.

> Reviewed-by: Rik van Riel <riel@xxxxxxxxxx>

Thanks!
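
To illustrate the wakeup condition discussed in the quoted changelog, here is a toy userspace model in C. It is only a sketch and not the kernel code: struct toy_zone, its fields, and both helper functions are made up for this example, and it ignores allocation order, lowmem reserves, and the compaction check that zone_balanced() adds.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a zone; NOT the kernel's struct zone. */
struct toy_zone {
	long free_pages;	/* observed (approximate) free page count */
	long low_wmark;		/* low watermark */
	long high_wmark;	/* high watermark, assumed >= low_wmark */
};

/*
 * Behaviour before the patch, simplified: wake kswapd only if the
 * observed free count is at or below the low watermark.
 */
static bool toy_wake_against_low(const struct toy_zone *z)
{
	return z->free_pages <= z->low_wmark;
}

/*
 * Behaviour after the patch, simplified: wake kswapd whenever the
 * high watermark is not met, which is also the point up to which
 * kswapd keeps reclaiming.
 */
static bool toy_wake_against_high(const struct toy_zone *z)
{
	return z->free_pages <= z->high_wmark;
}
```

With a zone whose counter reads slightly above the low watermark because of per-cpu drift, the first check skips the wakeup while the second one still triggers it. That gap between the two watermarks is the safety margin the changelog refers to.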