Re: [PATCH] mm, thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings

On Wed, 12 Sep 2018, Michal Hocko wrote:

> > Saying that we really want THP isn't an all-or-nothing decision.  We 
> > certainly want to try hard to fault hugepages locally especially at task 
> > startup when remapping our .text segment to thp, and MADV_HUGEPAGE works 
> > very well for that.  Remote hugepages would be a regression that we now 
> > have no way to avoid because the kernel doesn't provide for it, if we were 
> > to remove __GFP_THISNODE that this patch introduces.
> 
> Why cannot you use mempolicy to bind to local nodes if you really care
> about the locality?
> 

Because we do not want to oom kill; we want to fall back first to local
native pages and then to remote native pages.  That's the order of least
to greatest latency: we do not want to work hard to allocate a remote
hugepage when a local native page is faster.  This seems pretty
straightforward.
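
To make that ordering concrete, here is a toy model of the preference
described above.  It is only a sketch, not kernel code: the enum, the
function name and the boolean inputs are all hypothetical, and it
simplifies by never attempting a remote hugepage at all.

```c
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
enum alloc_choice {
    LOCAL_HUGEPAGE,  /* preferred: lowest latency */
    LOCAL_NATIVE,    /* first fallback */
    REMOTE_NATIVE,   /* second fallback */
    ALLOC_FAILED     /* nothing available in this model */
};

/*
 * Pick the lowest-latency option that is currently available.
 * A remote hugepage is deliberately not a candidate here, because
 * a local native page is assumed to be faster.
 */
enum alloc_choice pick_allocation(bool local_huge_ok,
                                  bool local_native_ok,
                                  bool remote_native_ok)
{
    if (local_huge_ok)
        return LOCAL_HUGEPAGE;
    if (local_native_ok)
        return LOCAL_NATIVE;
    if (remote_native_ok)
        return REMOTE_NATIVE;
    return ALLOC_FAILED;
}
```

A strict bind to the local node would stop after the first two steps,
which is where the oom kill concern comes from.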

> From what you have said so far it sounds like you would like to have
> something like the zone/node reclaim mode fine grained for a specific
> mapping. If we really want to support something like that then it should
> be a generic policy rather than THP specific thing IMHO.
> 
> As I've said it is hard to come up with a solution that would satisfy
> everybody but considering that the existing reports are seeing this as a
> regression and considering their NUMA requirements are not so strict as
> yours I would tend to think that stronger NUMA requirements should be
> expressed explicitly rather than as an implicit effect of a madvise flag.
> We do have APIs for that.

Every process on every platform we have would need to define this explicit
mempolicy for users of libraries that remap text segments because changing
the allocation behavior of thp out from under them would cause very
noticeable performance regressions.  I don't know of any platform where
remote hugepages are preferred over local native pages.  If they exist, it
sounds reasonable to introduce a stronger variant of MADV_HUGEPAGE that
defines exactly what you want rather than turning MADV_HUGEPAGE into a
dumping ground and causing userspace regressions.



