On 01/24/2018 04:47 PM, Zi Yan wrote:
>>>> With this change, whenever an application issues MADV_DONTNEED on a
>>>> memory region, the region is marked as "space-efficient". For such
>>>> regions, a hugepage is not immediately allocated on first write.
>>> Kirill didn't like it in the previous version and I do not like this
>>> either. You are adding a very subtle side effect which might be
>>> completely unexpected. Consider a userspace memory allocator which
>>> uses MADV_DONTNEED to free up unused memory. Now you have put it out
>>> of THP usage basically.
>>>
>> Userspace may want a region to be considered by khugepaged while opting
>> out of hugepage allocation on first touch. Asking userspace memory
>> allocators to track and reclaim unused parts of a THP-allocated
>> hugepage does not seem right, as the kernel can use simple userspace
>> hints to avoid allocating extra memory in the first place.
>>
>> I agree that this patch is adding a subtle side effect which may take
>> some applications by surprise. However, I often see the opposite too:
>> for many workloads, disabling THP is the first advice, as this aggressive
>> allocation of hugepages on first touch is unexpected and too
>> wasteful. For example:
>>
>> 1) Disabling THP for TokuDB (Storage engine for MySQL, MariaDB)
>> http://www.chriscalender.com/disabling-transparent-hugepages-for-tokudb/
>>
>> 2) Disable THP on MongoDB
>> https://docs.mongodb.com/manual/tutorial/transparent-huge-pages/
>>
>> 3) Disable THP for Couchbase Server
>> https://blog.couchbase.com/often-overlooked-linux-os-tweaks/
>>
>> 4) Redis
>> http://antirez.com/news/84
>>
>>
>>> If memory is really scarce then we have MADV_NOHUGEPAGE.
>>>
>> It's not really about memory scarcity but a more efficient use of it.
>> Applications may want hugepage benefits without requiring any changes to
>> app code, which is what THP is supposed to provide, while still avoiding
>> memory bloat.
>>
> I read these links and find that there are mainly two complaints:
> 1. THP causes latency spikes, because direct compaction slows down
>    THP allocation,
> 2. THP bloats the memory footprint when jemalloc uses MADV_DONTNEED to
>    return memory ranges smaller than THP size and fails because of THP.
>
> The first complaint is not related to this patch.

I'm trying to address many different THP issues, and memory bloat is
first among them.

> For the second one, at least with recent kernels, MADV_DONTNEED splits
> THPs and returns the memory range you specified in madvise(). Am I
> missing anything?
>

Yes, MADV_DONTNEED splits THPs and releases the requested range, but this
does not solve the issue of the aggressive alloc-hugepage-on-first-touch
policy of THP=madvise on MADV_HUGEPAGE regions. Sure, some workloads may
prefer that policy, but for applications that don't, this patch gives them
an option to hint to the kernel to go for gradual hugepage promotion
via khugepaged only (and not on first touch).

It's not good if an application has to track which parts of its
(implicitly allocated) hugepages are in use and which sub-parts are free
so that it can issue MADV_DONTNEED calls on them. This approach really
does not make THP "transparent" and requires a lot of mm tracking code in
userspace.

Nitin
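
For anyone who wants to observe the first-touch behaviour discussed in this thread, here is a minimal userspace sketch. It is not part of the patch. It assumes a 2 MiB hugepage size (HPAGE_SIZE is a constant chosen here, not queried from the kernel), and the helper names resident_pages() and thp_touch_demo() are made up for illustration. It maps an aligned anonymous region, opts in with MADV_HUGEPAGE, writes a single byte, and uses mincore() to count resident base pages before and after a MADV_DONTNEED on the touched base page.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE           /* for MADV_HUGEPAGE and mincore() in glibc */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumed THP size (typical for x86-64); hypothetical constant for this sketch. */
#define HPAGE_SIZE (2UL << 20)

/* Count base pages in [addr, addr + len) that mincore() reports as resident. */
static long resident_pages(void *addr, size_t len)
{
    long psz = sysconf(_SC_PAGESIZE);
    size_t n = (len + (size_t)psz - 1) / (size_t)psz;
    unsigned char *vec = malloc(n);
    long count = 0;

    if (!vec || mincore(addr, len, vec) != 0) {
        free(vec);
        return -1;
    }
    for (size_t i = 0; i < n; i++)
        count += vec[i] & 1;      /* bit 0 set means the page is resident */
    free(vec);
    return count;
}

/*
 * Touch one byte of a MADV_HUGEPAGE region, then MADV_DONTNEED the base page
 * holding that byte. Reports resident page counts after each step and the
 * value read back from the dropped page. Returns 0 on success, -1 on error.
 */
static int thp_touch_demo(long *after_touch, long *after_dontneed,
                          unsigned char *byte_after)
{
    long psz = sysconf(_SC_PAGESIZE);
    size_t map_len = 2 * HPAGE_SIZE;  /* over-map so an aligned window fits */
    char *raw = mmap(NULL, map_len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return -1;

    /* Round up to the next HPAGE_SIZE boundary inside the mapping. */
    char *p = (char *)(((uintptr_t)raw + HPAGE_SIZE - 1) &
                       ~(uintptr_t)(HPAGE_SIZE - 1));
    assert(p + HPAGE_SIZE <= raw + map_len);

    /* May fail on kernels built without THP; the demo then uses base pages. */
    (void)madvise(p, HPAGE_SIZE, MADV_HUGEPAGE);

    p[0] = 1;                                     /* first touch */
    *after_touch = resident_pages(p, HPAGE_SIZE);

    if (madvise(p, (size_t)psz, MADV_DONTNEED) != 0) {
        munmap(raw, map_len);
        return -1;
    }
    *after_dontneed = resident_pages(p, HPAGE_SIZE);
    *byte_after = (unsigned char)p[0];            /* refaults as a zero page */

    munmap(raw, map_len);
    return 0;
}
```

Calling thp_touch_demo() and printing the two counts should show the effect being debated: where a hugepage is allocated on first touch, the count after the single write is expected to cover the whole aligned window rather than one base page, and the MADV_DONTNEED then releases only the range it was given. The exact numbers depend on the kernel's THP settings and base page size, so none are asserted here.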