On Wed, 18 Mar 2015, Michal Hocko wrote:

> memcg currently uses hardcoded GFP_TRANSHUGE gfp flags for all THP
> charges. THP allocations, however, might be using different flags
> depending on /sys/kernel/mm/transparent_hugepage/{,khugepaged/}defrag
> and the current allocation context.
>
> The primary difference is that defrag configured to "madvise" value will
> clear __GFP_WAIT flag from the core gfp mask to make the allocation
> lighter for all mappings which are not backed by VM_HUGEPAGE vmas.
> If memcg charge path ignores this fact we will get light allocation but
> a potential memcg reclaim would kill the whole point of the
> configuration.
>
> Fix the mismatch by providing the same gfp mask used for the
> allocation to the charge functions. This is quite easy for all
> paths except for the khugepaged kernel thread with !CONFIG_NUMA which is
> doing a pre-allocation long before the allocated page is used in
> collapse_huge_page via khugepaged_alloc_page. To prevent from cluttering
> the whole code path from khugepaged_do_scan we simply return the current
> flags as per khugepaged_defrag() value which might have changed since
> the preallocation. If somebody changed the value of the knob we would
> charge differently but this shouldn't happen often and it is definitely
> not critical because it would only lead to a reduced success rate of
> one-off THP promotion.
>
> Acked-by: Vlastimil Babka <vbabka@xxxxxxx>
> Signed-off-by: Michal Hocko <mhocko@xxxxxxx>

Acked-by: David Rientjes <rientjes@xxxxxxxxxx>

I'm slightly surprised that this issue never got reported before.
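
For readers following the thread, the mismatch described in the quoted patch can be modelled with a small user-space C sketch. This is not the kernel code: the flag values and every sketch_* name are made up for illustration, and the charge step is reduced to one question, namely whether the mask passed to it would permit reclaim.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins only; the real kernel flag values differ. */
typedef unsigned int gfp_t;
#define SKETCH_GFP_WAIT      0x1u                      /* models __GFP_WAIT */
#define SKETCH_GFP_TRANSHUGE (0x100u | SKETCH_GFP_WAIT) /* models GFP_TRANSHUGE */

/* Mask used for the THP allocation: the WAIT bit is cleared when
 * defrag does not apply to the current context. */
static gfp_t sketch_thp_gfp(bool defrag)
{
    return defrag ? SKETCH_GFP_TRANSHUGE
                  : (SKETCH_GFP_TRANSHUGE & ~SKETCH_GFP_WAIT);
}

/* Hypothetical charge step: reclaim is possible only if the mask
 * it receives still carries the WAIT bit. */
static bool sketch_charge_may_reclaim(gfp_t gfp)
{
    return (gfp & SKETCH_GFP_WAIT) != 0;
}

/* Behaviour described as the problem: the charge always sees the
 * hardcoded mask, whatever the allocation used. */
static bool sketch_old_charge_may_reclaim(bool defrag)
{
    (void)defrag; /* ignored, which is exactly the mismatch */
    return sketch_charge_may_reclaim(SKETCH_GFP_TRANSHUGE);
}

/* Behaviour described as the fix: allocation and charge share one mask. */
static bool sketch_new_charge_may_reclaim(bool defrag)
{
    return sketch_charge_may_reclaim(sketch_thp_gfp(defrag));
}

/* Small self-check that spells out the difference. */
static int sketch_self_check(void)
{
    assert(sketch_old_charge_may_reclaim(false));  /* light alloc, heavy charge */
    assert(!sketch_new_charge_may_reclaim(false)); /* light alloc, light charge */
    assert(sketch_new_charge_may_reclaim(true));   /* defrag on: unchanged */
    return 0;
}
```

With defrag off for the context, the old path in this model still allows reclaim during the charge, while the new path does not. That is the "light allocation but a potential memcg reclaim" case from the patch description.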