On 24.07.2013 14:46, Hush Bensen wrote:
On 2013/7/24 19:18, Zlatko Calusic wrote:
On 22.07.2013 18:48, Zlatko Calusic wrote:
On 19.07.2013 22:55, Johannes Weiner wrote:
The way the page allocator interacts with kswapd creates aging
imbalances, where the amount of time a userspace page gets in memory
under reclaim pressure is dependent on which zone, which node the
allocator took the page frame from.
#1 fixes missed kswapd wakeups on NUMA systems, which lead to some
nodes falling behind for a full reclaim cycle relative to the other
nodes in the system
#3 fixes an interaction where kswapd and a continuous stream of page
allocations keep the preferred zone of a task between the high and
low watermark (allocations succeed + kswapd does not go to sleep)
indefinitely, completely underutilizing the lower zones and
thrashing on the preferred zone
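The interaction described in #3 can be illustrated with a toy model. This is only a sketch and not kernel code: the zone names, watermark values, page counts, and the `simulate` function are all hypothetical, and kswapd is simply assumed to be awake already. When reclaim keeps pace with allocation, the preferred zone hovers between the low and high watermarks, every allocation succeeds there, and the lower zone is never touched.

```python
def simulate(steps, low=100, high=200, alloc=10, reclaim=10):
    """Toy model: two zones, allocator prefers 'Normal', kswapd already awake."""
    free = {"Normal": 150, "DMA32": 1000}  # hypothetical starting free pages
    allocated = {"Normal": 0, "DMA32": 0}
    kswapd_awake = True
    for _ in range(steps):
        # Allocator: use the first zone that would stay at or above the low watermark.
        for zone in ("Normal", "DMA32"):
            if free[zone] - alloc >= low:
                free[zone] -= alloc
                allocated[zone] += alloc
                break
        # kswapd: reclaim in the preferred zone until it reaches the high watermark.
        if kswapd_awake:
            free["Normal"] += reclaim
            if free["Normal"] >= high:
                kswapd_awake = False
    return allocated, kswapd_awake


if __name__ == "__main__":
    print(simulate(100))
```

Calling `simulate(10, reclaim=0)` shows the contrasting case: without reclaim, the preferred zone drops to its low watermark and the allocator falls back to the lower zone.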
These patches are the aging fairness part of the thrash-detection
based file LRU balancing. Andrea recommended submitting them
separately as they are bugfixes in their own right.
I have the patch applied and under testing. So far, so good. It looks
like it could finally fix the bug that I was chasing a few months ago
(nicely described in your bullet #3). But a few more days of testing will
be needed before I can reach a quality verdict.
Well, only 2 days later it's already obvious that the patch is
perfect! :)
In the attached image, the graphs in the left column cover the last
day and a half. It can be observed that the zones are really balanced, and
that aging is practically perfect. The graphs in the right column cover
the last 10-day period, and the left side of the upper graph shows how it
would look with the stock kernel after about 20 days of uptime (although
only a few days are enough to reach such an imbalance). File pages in the
Normal zone are an extinct species (red) and the zone is chock full of
anon pages (blue). Having seen a lot of these graphs, I'm certain that
it won't happen anymore with your patch applied. The balance is
restored! Thank you for your work. Feel free to add:
Tested-by: Zlatko Calusic <zcalusic@xxxxxxxxxxx>
Thanks for your testing, Zlatko. Could you tell me which benchmark or
workload you are using? Btw, which tool did you use to draw these nice
pictures? ;-)
The workload is mixed (various services, light load). The biggest
I/O load comes from the backup procedure that runs every evening. The graphs are
home-made, a little bit of rrd, a little bit of perl, nothing too
complex. I'm actually slowly getting rid of these extra graphs, because
I used them only for debugging this specific problem, which is now fixed
thanks to Johannes.
--
Zlatko