Re: [patch 1/3] mm: vmscan: fix numa reclaim balance problem in kswapd

On Mon, Jul 22, 2013 at 03:47:03PM -0400, Rik van Riel wrote:
> On 07/19/2013 04:55 PM, Johannes Weiner wrote:
> >When the page allocator fails to get a page from all zones in its
> >given zonelist, it wakes up the per-node kswapds for all zones that
> >are at their low watermark.
> >
> >However, with a system under load and the free page counters being
> >per-cpu approximations, the observed counter value in a zone can
> >fluctuate enough that the allocation fails but the kswapd wakeup is
> >also skipped while the zone is still really close to the low
> >watermark.
> >
> >When one node misses a wakeup like this, it won't be aged before all
> >the other nodes' zones are down to their low watermarks again.  And
> >skipping a full aging cycle is an obvious fairness problem.
> >
> >Kswapd runs until the high watermarks are restored, so it should also
> >be woken when the high watermarks are not met.  This ages nodes more
> >equally and creates a safety margin for the page counter fluctuation.
> >
> >By using zone_balanced(), it will now check, in addition to the
> >watermark, if compaction requires more order-0 pages to create a
> >higher order page.
> >
> >Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> 
> This patch alone looks like it could have the effect of increasing
> the pressure on the first zone in the zonelist, keeping its free
> memory above the low watermark essentially forever, without having
> the allocator fall back to other zones.
> 
> However, your third patch fixes that problem, and missed wakeups
> would still hurt, so...

The kswapd wakeups happen in the slowpath, after the fastpath tried
all zones in the zonelist, not just the first one.

With the problem fixed in #3, the slowpath is rarely entered (even
when kswapds should be woken).  From that point of view, the effects
of #1 are further improved by #3, but #1 on its own does not worsen
the situation.
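
To make the wakeup criterion concrete, here is a rough userspace sketch
of the decision described in the changelog.  It is not the kernel code:
struct toy_zone, toy_watermark_ok() and toy_wake_all_kswapds() are
made-up names for illustration, the "free > mark" comparison is a
simplification, and the compaction part of zone_balanced() is left out
entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a zone; not the kernel's struct zone. */
struct toy_zone {
	unsigned long free_pages;   /* observed (approximate) free page count */
	unsigned long wmark_low;    /* low watermark */
	unsigned long wmark_high;   /* high watermark */
	bool kswapd_woken;          /* set when this zone's kswapd is woken */
};

/* Assumed simplification: a zone meets a watermark when free > mark. */
static bool toy_watermark_ok(const struct toy_zone *z, unsigned long mark)
{
	assert(z != NULL);
	return z->free_pages > mark;
}

/*
 * Slowpath wakeup over every zone in the list.
 * check_high == false: skip zones above their low watermark (old check).
 * check_high == true:  skip only zones above their high watermark (patched).
 * Returns the number of zones whose kswapd was woken.
 */
static size_t toy_wake_all_kswapds(struct toy_zone *zones, size_t n,
				   bool check_high)
{
	size_t woken = 0;

	for (size_t i = 0; i < n; i++) {
		unsigned long mark = check_high ? zones[i].wmark_high
						: zones[i].wmark_low;

		if (toy_watermark_ok(&zones[i], mark))
			continue;	/* looks balanced, wakeup skipped */

		zones[i].kswapd_woken = true;
		woken++;
	}
	return woken;
}
```

With two zones that both have low=100 and high=150, one observed at 100
free pages and the other at 103 because of counter fluctuation, the
low-watermark check wakes only the first zone, while the high-watermark
check wakes both.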

> Reviewed-by: Rik van Riel <riel@xxxxxxxxxx>

Thanks!
