On Wed, 26 Sep 2018, Kirill A. Shutemov wrote:

> On Tue, Sep 25, 2018 at 02:03:26PM +0200, Michal Hocko wrote:
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index c3bc7e9c9a2a..c0bcede31930 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -629,21 +629,40 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
> >   *	    available
> >   *	never: never stall for any thp allocation
> >   */
> > -static inline gfp_t alloc_hugepage_direct_gfpmask(struct vm_area_struct *vma)
> > +static inline gfp_t alloc_hugepage_direct_gfpmask(struct vm_area_struct *vma, unsigned long addr)
> >  {
> >  	const bool vma_madvised = !!(vma->vm_flags & VM_HUGEPAGE);
> > +	gfp_t this_node = 0;
> > +
> > +#ifdef CONFIG_NUMA
> > +	struct mempolicy *pol;
> > +	/*
> > +	 * __GFP_THISNODE is used only when __GFP_DIRECT_RECLAIM is not
> > +	 * specified, to express a general desire to stay on the current
> > +	 * node for optimistic allocation attempts. If the defrag mode
> > +	 * and/or madvise hint requires the direct reclaim then we prefer
> > +	 * to fallback to other node rather than node reclaim because that
> > +	 * can lead to excessive reclaim even though there is free memory
> > +	 * on other nodes. We expect that NUMA preferences are specified
> > +	 * by memory policies.
> > +	 */
> > +	pol = get_vma_policy(vma, addr);
> > +	if (pol->mode != MPOL_BIND)
> > +		this_node = __GFP_THISNODE;
> > +	mpol_cond_put(pol);
> > +#endif
>
> I'm not very good with NUMA policies. Could you explain in more details how
> the code above is equivalent to the code below?
>

It breaks mbind() because new_page() is now using numa_node_id() to
allocate migration targets instead of using the mempolicy. I'm not sure
that this patch was tested for mbind().
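
For reference, here is a minimal user-space sketch of the this_node
selection in the quoted hunk. It is a model, not kernel code: the
MODEL_* names and their values are hypothetical stand-ins, and since the
rest of alloc_hugepage_direct_gfpmask() is not quoted above, the
combining step in model_gfpmask() is only an assumption based on the
patch comment (__GFP_THISNODE is used only when __GFP_DIRECT_RECLAIM is
not specified).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; these are NOT the kernel's values. */
enum model_mpol_mode {
	MODEL_MPOL_DEFAULT,
	MODEL_MPOL_PREFERRED,
	MODEL_MPOL_BIND,
	MODEL_MPOL_INTERLEAVE,
};

#define MODEL_GFP_DIRECT_RECLAIM 0x1u
#define MODEL_GFP_THISNODE       0x2u

/* Mirrors the quoted hunk: every policy except MPOL_BIND gets the hint. */
static unsigned int model_this_node(enum model_mpol_mode mode)
{
	return mode != MODEL_MPOL_BIND ? MODEL_GFP_THISNODE : 0u;
}

/*
 * Assumption: the remainder of the function is not quoted, so this step
 * only follows the patch comment. When direct reclaim is requested the
 * node hint is dropped; otherwise the hint from model_this_node() is used.
 */
static unsigned int model_gfpmask(enum model_mpol_mode mode, bool direct_reclaim)
{
	unsigned int gfp;

	if (direct_reclaim)
		gfp = MODEL_GFP_DIRECT_RECLAIM;
	else
		gfp = model_this_node(mode);

	/* Invariant stated in the patch comment: never both flags at once. */
	assert(!((gfp & MODEL_GFP_THISNODE) && (gfp & MODEL_GFP_DIRECT_RECLAIM)));
	return gfp;
}
```

Note that this only models the fault-path mask. It says nothing about
the new_page() migration path, which is where the mbind() problem shows
up.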