# CCed Andrew and linux-mm On Wed, Mar 04, 2015 at 10:53:27PM -0800, David Rientjes wrote: > On Wed, 4 Mar 2015, Kazutomo Yoshii wrote: > > > I noticed that numa_alloc_onnode() failed to allocate memory on a > > specified node in v4.0-rc1. I added a code to check the return value > > of walk_page_range() in queue_pages_range() so that do_mbind() only > > returns an error number or zero. > > > > I assume this is libnuma-2.0.10? > > > Signed-off-by: Kazutomo Yoshii <kazutomo.yoshii@xxxxxxxxx> > > --- > > mm/mempolicy.c | 6 +++++- > > 1 file changed, 5 insertions(+), 1 deletion(-) > > > > diff --git a/mm/mempolicy.c b/mm/mempolicy.c > > index 4721046..ea79171 100644 > > --- a/mm/mempolicy.c > > +++ b/mm/mempolicy.c > > @@ -644,6 +644,7 @@ queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end, > > .nmask = nodes, > > .prev = NULL, > > }; > > + int err; > > struct mm_walk queue_pages_walk = { > > .hugetlb_entry = queue_pages_hugetlb, > > .pmd_entry = queue_pages_pte_range, > > @@ -652,7 +653,10 @@ queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end, > > .private = &qp, > > }; > > - return walk_page_range(start, end, &queue_pages_walk); > > + err = walk_page_range(start, end, &queue_pages_walk); > > + if (err < 0) > > + return err; > > + return 0; > > } > > /* > > I'm afraid I don't think this is the right fix, if walk_page_range() > returns a positive value then it should be supplied by one of the > callbacks in the struct mm_walk, which none of these happen to do. I > think this may be a problem with commit 6f4576e3687b ("mempolicy: apply > page table walker on queue_pages_range()"), so let's add Naoya to the > thread. Thank you for reporting/forwarding, Yoshii-san and David. This bug is in the pagewalk's common path, and the following patch should fix it. Thanks, Naoya Horiguchi --- >From 107fa3fb256bddff40a882c90af717af9863aed7 Mon Sep 17 00:00:00 2001 From: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx> Date: Thu, 5 Mar 2015 16:37:37 +0900 Subject: [PATCH] mm: pagewalk: prevent positive return value of walk_page_test() from being passed to callers walk_page_test() is purely pagewalk's internal stuff, and its positive return values are not intended to be passed to the callers of pagewalk. However, in the current code if the last vma in the do-while loop in walk_page_range() happens to return a positive value, it leaks outside walk_page_range(). So the user visible effect is invalid/unexpected return value (according to the reporter, mbind() causes it.) This patch fixes it simply by reinitializing the return value after checked. Another exposed interface, walk_page_vma(), already returns 0 for such cases so no problem. Fixes: 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()") Reported-by: Kazutomo Yoshii <kazutomo.yoshii@xxxxxxxxx> Signed-off-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx> --- mm/pagewalk.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/mm/pagewalk.c b/mm/pagewalk.c index 75c1f2878519..29f2f8b853ae 100644 --- a/mm/pagewalk.c +++ b/mm/pagewalk.c @@ -265,8 +265,15 @@ int walk_page_range(unsigned long start, unsigned long end, vma = vma->vm_next; err = walk_page_test(start, next, walk); - if (err > 0) + if (err > 0) { + /* + * positive return values are purely for + * controlling the pagewalk, so should never + * be passed to the callers. + */ + err = 0; continue; + } if (err < 0) break; } -- 1.9.3 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href