On 02/19/2015 01:08 AM, Rik van Riel wrote:
> On 02/18/2015 06:31 PM, Andrew Morton wrote:
>> On Wed, 11 Feb 2015 23:03:55 +0200 Ebru Akagunduz
>> <ebru.akagunduz@xxxxxxxxx> wrote:
>
>>> This patch improves THP collapse rates, by allowing zero pages.
>>>
>>> Currently THP can collapse 4kB pages into a THP when there are up
>>> to khugepaged_max_ptes_none pte_none ptes in a 2MB range. This
>>> patch counts pte none and mapped zero pages with the same
>>> variable.
>
>> So if I'm understanding this correctly, with the default value of
>> khugepaged_max_ptes_none (HPAGE_PMD_NR-1), if an application
>> creates a 2MB area which contains 511 mappings of the zero page and
>> one real page, the kernel will proceed to turn that area into a
>> real, physical huge page. So it consumes 2MB of memory which would
>> not have previously been allocated?
>
> This is equivalent to an application doing a write fault
> to a 2MB area that was previously untouched, going into
> do_huge_pmd_anonymous_page() and receiving a 2MB page.
>
>> If so, this might be rather undesirable behaviour in some
>> situations (and ditto the current behaviour for pte_none ptes)?
>
>> This can be tuned by adjusting khugepaged_max_ptes_none,
>
> The example of directly going into do_huge_pmd_anonymous_page()
> is not influenced by the tunable.
>
> It may indeed be undesirable in some situations, but I am
> not sure how to detect those...

Well, yeah. We seem to lack a setting that restricts page fault THP
allocations to e.g. madvise regions, while still letting khugepaged
collapse the other ranges later, taking khugepaged_max_ptes_none into
account.
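To make the threshold arithmetic from the discussion concrete, here is a
minimal sketch in Python. It is not kernel code: it assumes x86-64 with 4kB
base pages (so HPAGE_PMD_NR is 512), the PTE labels and function names are
hypothetical, and it models only the khugepaged_max_ptes_none comparison
with pte_none and zero-page entries counted together, as the patch
description states. Any other conditions the real scan applies are left out.

```python
# Simplified model of the khugepaged_max_ptes_none check discussed above.
# Assumption: x86-64 with 4 kB base pages, so one 2 MB range holds 512 PTEs.
HPAGE_PMD_NR = 512
PAGE_SIZE_KIB = 4

# Hypothetical labels for the state of each PTE in the 2 MB range.
NONE, ZERO, MAPPED = "none", "zero", "mapped"


def none_or_zero(ptes):
    """Count PTEs that are empty or that map the shared zero page."""
    return sum(1 for pte in ptes if pte in (NONE, ZERO))


def would_collapse(ptes, max_ptes_none=HPAGE_PMD_NR - 1):
    """Return True if the range passes the max_ptes_none threshold.

    Only the threshold is modelled here; other conditions of the real
    scan are deliberately not represented.
    """
    if len(ptes) != HPAGE_PMD_NR:
        raise ValueError("expected exactly HPAGE_PMD_NR entries")
    return none_or_zero(ptes) <= max_ptes_none


def extra_memory_kib(ptes):
    """Memory that becomes backed by real pages if the range is collapsed."""
    return none_or_zero(ptes) * PAGE_SIZE_KIB


if __name__ == "__main__":
    # Andrew's example: one real page plus 511 zero-page mappings.
    area = [MAPPED] + [ZERO] * (HPAGE_PMD_NR - 1)
    print(would_collapse(area))                   # default threshold
    print(would_collapse(area, max_ptes_none=0))  # strictest threshold
    print(extra_memory_kib(area))
```

With the default threshold the example range qualifies for collapse, and
lowering max_ptes_none rejects it. On a running system the tunable is
exposed as /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none.
As noted in the thread, it has no effect on the page fault path through
do_huge_pmd_anonymous_page().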