Re: [PATCH] mremap: add MREMAP_NOHOLE flag --resend

On 03/26/2015 04:24 AM, Daniel Micay wrote:
> It's all well and good to say that you shouldn't do that, but it's the
> basis of the design in jemalloc and other zone-based arena allocators.
>
> There's a chosen chunk size and chunks are naturally aligned. An
> allocation is either a span of chunks (chunk-aligned) or has metadata
> stored in the chunk header. This also means chunks can be assigned to
> arenas for a high level of concurrency. Thread caching is then only
> necessary for batching operations to amortize the cost of locking rather
> than to reduce contention. Per-CPU arenas can be implemented quite well
> by using sched_getcpu() to move threads around whenever it detects that
> another thread allocated from the arena.
>
> With >= 2M chunks, madvise purging works very well at the chunk level
> but there's also fine-grained purging within chunks and it completely
> breaks down from THP page faults.
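
The chunk layout described in the quoted message (naturally aligned chunks, with metadata in the chunk header) comes down to simple address arithmetic. Below is a minimal sketch of that arithmetic, not jemalloc's actual code. The names CHUNK_SIZE, chunk_base and is_chunk_aligned are hypothetical, and the 2 MiB chunk size is an assumption taken from the ">= 2M chunks" case.

```python
CHUNK_SIZE = 2 * 1024 * 1024  # assumed chunk size (2 MiB); must be a power of two


def chunk_base(addr: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Return the start address of the chunk that contains addr."""
    # Natural alignment means every chunk starts at a multiple of chunk_size,
    # so clearing the low bits of any interior address yields the chunk start,
    # which is where the chunk header (metadata) would live.
    return addr & ~(chunk_size - 1)


def is_chunk_aligned(addr: int, chunk_size: int = CHUNK_SIZE) -> bool:
    """True if addr sits exactly on a chunk boundary (a span-of-chunks allocation)."""
    return (addr & (chunk_size - 1)) == 0
```

With this layout no lookup table is needed to get from an address to its metadata: a chunk-aligned address denotes a span of whole chunks, and any other address finds its header via chunk_base().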

Are you sure it's due to page faults, and not to khugepaged combined with a high value of max_ptes_none (such as the default of 511), as reported here?

https://bugzilla.kernel.org/show_bug.cgi?id=93111
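
To check which value is in effect on a given machine, the tunable can be read from sysfs. The following is a small sketch under a few assumptions: the helper names read_max_ptes_none and collapse_threshold are hypothetical, the path is the usual sysfs location of the khugepaged tunables, and 512 PTEs per huge page assumes 4K base pages with 2M THP.

```python
from pathlib import Path
from typing import Optional

# Usual sysfs location of the khugepaged tunables on Linux.
KHUGEPAGED_DIR = Path("/sys/kernel/mm/transparent_hugepage/khugepaged")


def read_max_ptes_none(base: Path = KHUGEPAGED_DIR) -> Optional[int]:
    """Return khugepaged's max_ptes_none, or None if it cannot be read."""
    try:
        return int((base / "max_ptes_none").read_text().strip())
    except (OSError, ValueError):
        # THP not built in, sysfs not mounted, or unexpected file contents.
        return None


def collapse_threshold(max_ptes_none: int, ptes_per_thp: int = 512) -> int:
    """Minimum number of populated PTEs in a range before khugepaged may collapse it."""
    # Assumption: 512 PTEs per huge page (4K base pages, 2M THP).
    return ptes_per_thp - max_ptes_none


if __name__ == "__main__":
    value = read_max_ptes_none()
    if value is None:
        print("max_ptes_none not available on this system")
    else:
        print(f"max_ptes_none={value}, collapse possible from "
              f"{collapse_threshold(value)} populated PTE(s)")
```

With the default of 511, a single populated page in a 2M-aligned range is enough for khugepaged to consider collapsing it, which would undo fine-grained purging.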

Once you have faulted in a THP, then purged part of it and split it, I don't think page faults in the purged part can lead to a new THP collapse; only khugepaged can do that, AFAIK. And if you mmap areas smaller than 2M (i.e. your 256K chunks), that should prevent THP page faults on the first fault within the chunk as well.
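
For reference, the sequence discussed above (map a chunk smaller than 2M, touch it, purge part of it) looks roughly like this from userspace. This is a Linux-only sketch with hypothetical helper names (map_chunk, purge), and it needs Python 3.8 or later for mmap.madvise. It only illustrates the calls involved; it does not by itself show how THP behaves.

```python
import mmap

CHUNK_SIZE = 256 * 1024  # 256K chunk, well below the 2M THP size


def map_chunk(size: int = CHUNK_SIZE) -> mmap.mmap:
    """Map a private anonymous region of the given size."""
    # MAP_PRIVATE matters here: Python's default is a shared mapping, for
    # which MADV_DONTNEED does not discard the contents.
    return mmap.mmap(-1, size,
                     flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                     prot=mmap.PROT_READ | mmap.PROT_WRITE)


def purge(chunk: mmap.mmap, offset: int, length: int) -> None:
    """Drop the pages in [offset, offset + length) with MADV_DONTNEED."""
    # offset must be page-aligned; the kernel rejects an unaligned start.
    chunk.madvise(mmap.MADV_DONTNEED, offset, length)


if __name__ == "__main__":
    chunk = map_chunk()
    page = mmap.PAGESIZE
    chunk[0:4] = b"keep"            # first page stays populated
    chunk[page:page + 4] = b"drop"  # second page will be purged
    purge(chunk, page, page)
    # A purged private anonymous page reads back as zeroes on the next fault.
    print(chunk[0:4], chunk[page:page + 4])
    chunk.close()
```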
