On 05/06/2015 01:58 PM, Waiman Long wrote:
On 05/06/2015 06:22 AM, Mel Gorman wrote:
On Wed, May 06, 2015 at 08:12:46AM +0100, Mel Gorman wrote:
On Tue, May 05, 2015 at 03:25:49PM -0700, Andrew Morton wrote:
On Tue, 5 May 2015 23:13:29 +0100 Mel Gorman <mgorman@xxxxxxx> wrote:
Alternatively, the page allocator can go off and synchronously
initialize some pageframes itself. Keep doing that until the
allocation attempt succeeds.
That was rejected during review of earlier attempts at this feature on
the grounds that it impacted allocator fast paths.
eh? Changes are only needed on the allocation-attempt-failed path,
which is slow-path.
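
(For illustration only, not code from this thread: a minimal sketch of the
retry-until-success idea, where deferred_init_some_pages() is a hypothetical
helper that initialises another batch of deferred struct pages and returns
false once nothing is left to do. __GFP_NORETRY is used so each failed
attempt returns quickly rather than looping in reclaim.)

	#include <linux/gfp.h>

	/* Hypothetical: initialise more deferred struct pages, false when done */
	extern bool deferred_init_some_pages(void);

	/*
	 * Sketch of "initialise on demand from the failed-allocation path":
	 * retry the allocation after each batch of deferred initialisation.
	 */
	static struct page *alloc_pages_retry_with_init(gfp_t gfp_mask,
							unsigned int order)
	{
		struct page *page;

		do {
			page = alloc_pages(gfp_mask | __GFP_NORETRY, order);
			if (page)
				return page;
			/* Failed: initialise more struct pages, then retry */
		} while (deferred_init_some_pages());

		return NULL;
	}
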
We'd have to distinguish between falling back to other zones because the
high zone is artificially exhausted and normal ALLOC_BATCH exhaustion.
We'd also have to avoid falling back to remote nodes prematurely. While I
have not tried an implementation, I expected the checks would need to be
in the fast paths unless I used jump labels to get around it. I'm going
to try altering when we initialise instead so that it happens earlier.
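
(Again purely illustrative, not part of any patch in this thread: a sketch
of how a jump label could keep such a check out of the allocator fast path.
deferred_grow_zone() is a hypothetical helper that initialises more pages
in the given zone and reports whether it made progress, and the static-key
names are assumed rather than taken from the thread.)

	#include <linux/jump_label.h>
	#include <linux/mmzone.h>

	/* Hypothetical: initialise another batch of pages in @zone */
	extern bool deferred_grow_zone(struct zone *zone, unsigned int order);

	/* Patched to a no-op once deferred initialisation has completed */
	static DEFINE_STATIC_KEY_TRUE(deferred_pages);

	static bool try_grow_deferred_zone(struct zone *zone, unsigned int order)
	{
		if (!static_branch_unlikely(&deferred_pages))
			return false;	/* fast path: branch patched out */

		return deferred_grow_zone(zone, order);
	}

	/* Called once when all deferred struct pages have been initialised */
	static void deferred_init_done(void)
	{
		static_branch_disable(&deferred_pages);
	}
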
Which looks as follows. Waiman, a test on the 24TB machine would be
appreciated again. This patch should be applied instead of "mm: meminit:
Take into account that large system caches scale linearly with memory"
---8<---
mm: meminit: Finish initialisation of memory before basic setup
Waiman Long reported that 24TB machines hit OOM during basic setup when
struct page initialisation was deferred. One approach is to initialise
memory on demand but it interferes with page allocator paths. This patch
creates dedicated threads to initialise memory before basic setup. It
then blocks on a rw_semaphore until completion, as a wait_queue and
counter would be overkill.

This may be slower to boot but it's simpler overall and also gets rid of
a lot of section mangling which existed so kswapd could do the
initialisation.
Signed-off-by: Mel Gorman <mgorman@xxxxxxx>
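
(A minimal sketch of the pattern the changelog describes, assuming one
kthread per memory node and a rw_semaphore that the boot thread takes for
write once every per-node reader has finished; the function and variable
names are illustrative, not quoted from the patch itself.)

	#include <linux/init.h>
	#include <linux/kthread.h>
	#include <linux/mmzone.h>
	#include <linux/nodemask.h>
	#include <linux/rwsem.h>

	static __initdata DECLARE_RWSEM(pgdat_init_rwsem);

	/* Per-node thread: data is the node's pg_data_t */
	static int __init deferred_init_memmap(void *data)
	{
		/* ... initialise this node's deferred struct pages ... */

		up_read(&pgdat_init_rwsem);	/* signal this node is done */
		return 0;
	}

	void __init page_alloc_init_late(void)
	{
		int nid;

		for_each_node_state(nid, N_MEMORY) {
			down_read(&pgdat_init_rwsem);	/* one reader per node */
			kthread_run(deferred_init_memmap, NODE_DATA(nid),
				    "pgdatinit%d", nid);
		}

		/* The write lock is only granted once all readers have dropped it */
		down_write(&pgdat_init_rwsem);
		up_write(&pgdat_init_rwsem);
	}
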
This patch moves the deferred meminit from kswapd to its own kernel
threads started after smp_init(). However, the hash table allocation is
done earlier than that. It seems like it will still run out of memory on
the 24TB machine that I tested on.
I will certainly try it out, but I doubt it will solve the problem on
its own.
It turns out that the two new patches did work on the 24-TB DragonHawk
without the "mm: meminit: Take into account that large system caches
scale linearly with memory" patch. The bootup time was 357s, which was
just a few seconds slower than the other bootup times that I sent you
yesterday.
BTW, do you want to change the following log message as kswapd will no
longer be the one doing deferred meminit?
kswapd 0 initialised 396098436 pages in 6024ms
Cheers,
Longman