Re: [PATCH 7/7] mm: compaction: add compaction to zone_reclaim_mode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Jul 13, 2013 at 07:23:58AM +0800, Hush Bensen wrote:
> Do you mean your patch done this fair? There is target zone shrink as 
> you mentiond in the vanilla kernel, however, your patch also done target 
> compaction/reclaim, is this fair?

It's still not fair, zone_reclaim_mode cannot be (modulo major rework
at least) as its whole point is to reclaim memory from the local node
indefinitely, even if there's plenty of "free" or "reclaimable" memory
in remote nodes.

But waking kswapd before all nodes are below the low wmark probably
would make it even less fair than it is now, or at least it wouldn't
provide a fariness increase.

The idea of allowing allocations in the min-low wmark range is that
the "low" wmark would be restored soon anyway at the next
zone_reclaim() invocation, and the zone_reclaim will still behave
synchronous (like direct reclaim) without ever waking kswapd,
regardless if we stop at the low or at the min. But if we stop at the
"low" we're more susceptible to parallel allocation jitters as the
jitter-error margin then becomes:

		.nr_to_reclaim = max(nr_pages, SWAP_CLUSTER_MAX),

which is just 1 single high order page in case of (1<<order) >=
SWAP_CLUSTER_MAX. While if we use the "min" wmark after a successful
zone_reclaim(zone) to decide if to allocate from the zone (the one
passed to zone_reclaim, we may have more margin for allocation jitters
in other CPUs of the same node, or interrupts.

So this again is connected to altering the wmark calculation for high
order pages in the previous patch (which also is intended to allow
having more than 1 THP page in the low-min wmark range). We don't need
many, too many is just a waste of CPU, but a few more than 1
significantly improves the NUMA locality on first allocation if all
CPUs in the node are allocating memory at the same time. I also
trimmed down to zero the high order page requirement for the min
wmark, as we don't need to guarantee hugepage availability for
PF_MEMALLOC (which avoids useless compaction work).

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]