Re: [PATCH 10/10] mm, page_alloc: Only enforce watermarks for order-0 allocations

On Fri, Jul 31, 2015 at 03:08:38PM +0900, Joonsoo Kim wrote:
> On Mon, Jul 20, 2015 at 09:00:19AM +0100, Mel Gorman wrote:
> > From: Mel Gorman <mgorman@xxxxxxx>
> > 
> > The primary purpose of watermarks is to ensure that reclaim can always
> > make forward progress in PF_MEMALLOC context (kswapd and direct reclaim).
> > These assume that order-0 allocations are all that is necessary for
> > forward progress.
> > 
> > High-order watermarks serve a different purpose. Kswapd had no high-order
> > awareness before they were introduced (https://lkml.org/lkml/2004/9/5/9).
> > This was particularly important when there were high-order atomic requests.
> > The watermarks both gave kswapd awareness and made a reserve for those
> > atomic requests.
> > 
> > There are two important side-effects of this. The most important is that
> > a non-atomic high-order request can fail even though free pages are available
> > and the order-0 watermarks are ok. The second is that high-order watermark
> > checks are expensive as the free list counts up to the requested order must
> > be examined.
> > 
> > With the introduction of MIGRATE_HIGHATOMIC it is no longer necessary to
> > have high-order watermarks. Kswapd and compaction still need high-order
> > awareness which is handled by checking that at least one suitable high-order
> > page is free.
> 
> I totally agree with removing the watermark check for orders from
> PAGE_ALLOC_COSTLY_ORDER to MAX_ORDER. It doesn't make sense to
> maintain high-order free pages for which MM doesn't guarantee allocation
> success. For example, on my system, when there is one order-9 free page,
> an order-9 allocation request fails because the watermark check requires
> at least two order-9 free pages for an order-9 allocation to succeed.
> 
> But I think watermark checking for orders up to PAGE_ALLOC_COSTLY_ORDER is
> different. If we maintain just one high-order free page, successive
> high-order allocation requests that should succeed will always fall into
> the allocation slow path and go into direct reclaim/compaction. That
> increases latency for many workloads. We should keep at least some number
> of free pages ready to handle successive high-order allocation requests
> gracefully.
> 
> So, how about following?
> 
> 1) kswapd checks watermarks as it does now, up to PAGE_ALLOC_COSTLY_ORDER.
> This guarantees that kswapd prepares some number of high-order free pages
> so successive high-order allocation requests will be handled gracefully.
> 2) In the !kswapd case, just check whether an appropriate free page is
> in the buddy allocator or not.
> 

If !atomic allocations use the high-order reserves then they'll fragment
similarly to how they get fragmented today. It defeats the purpose of
the reserve. I noted in the leader that embedded platforms may choose to
carry an out-of-tree patch that makes the reserves a kernel reserve for
high-order pages, but I didn't think it was a good idea for mainline.

Your suggestion implies we have two watermark checks. The fast path
would obey watermarks in the traditional way. kswapd would use the same
watermark check. The slow path would use the watermark check in this
patch. It is quite complex when historically it was expected that a
!atomic high-order allocation request may take a long time. Furthermore,
it's the case that kswapd gives up high-order reclaim requests very
quickly because there were cases where a high-order request would cause
kswapd to continually reclaim when the system was fragmented. I fear
that your suggestion would partially reintroduce the problem in the name
of trying to decrease the latency of a !atomic high-order allocation
request that is expected to be expensive sometimes.
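
For readers following the thread, the two styles of check under discussion
can be sketched roughly as below. This is a simplified userspace model, not
the kernel's code: struct model_zone, MODEL_MAX_ORDER and both function names
are hypothetical, and it ignores lowmem reserves, migratetypes, allocation
flags and the MIGRATE_HIGHATOMIC reserve. The first function requires a
shrinking watermark to be met at every order below the request; the second
enforces only the order-0 watermark and then asks for at least one free block
of sufficient order.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_ORDER 11	/* illustrative: models orders 0..10 */

/* Hypothetical stand-in for a zone: number of free blocks per order. */
struct model_zone {
	unsigned long nr_free[MODEL_MAX_ORDER];
};

/* Total free pages across all orders. */
static long model_free_pages(const struct model_zone *z)
{
	long pages = 0;

	for (unsigned int o = 0; o < MODEL_MAX_ORDER; o++)
		pages += (long)(z->nr_free[o] << o);
	return pages;
}

/* Traditional style: the watermark is checked at each order below the request. */
static bool wmark_ok_per_order(const struct model_zone *z,
			       unsigned int order, long min)
{
	long free_pages = model_free_pages(z) - ((1L << order) - 1);

	assert(order < MODEL_MAX_ORDER);
	if (free_pages <= min)
		return false;

	for (unsigned int o = 0; o < order; o++) {
		/* Blocks smaller than the request cannot satisfy it. */
		free_pages -= (long)(z->nr_free[o] << o);
		/* Require fewer free pages at each higher order. */
		min >>= 1;
		if (free_pages <= min)
			return false;
	}
	return true;
}

/* Order-0 watermark only, plus at least one free block that is big enough. */
static bool wmark_ok_order0_only(const struct model_zone *z,
				 unsigned int order, long min)
{
	long free_pages = model_free_pages(z) - ((1L << order) - 1);

	assert(order < MODEL_MAX_ORDER);
	if (free_pages <= min)
		return false;
	if (order == 0)
		return true;

	for (unsigned int o = order; o < MODEL_MAX_ORDER; o++)
		if (z->nr_free[o] > 0)
			return true;
	return false;
}
```

With made-up numbers (a min watermark of 1024 pages, 4096 free order-0 pages
and one free order-9 block), the first check rejects an order-9 request and
the second accepts it. In this model the first check passes only once a
second order-9 block is free, which mirrors the situation Joonsoo describes.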

-- 
Mel Gorman
SUSE Labs
