Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page

Bob Liu <lliubbo@xxxxxxxxx> · Sat, 17 Apr 2010 21:54:31 +0800

On Fri, Apr 16, 2010 at 11:55 PM, Christoph Lameter
<cl@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, 16 Apr 2010, Bob Liu wrote:
>
>> Hmm.
>> What about this change? If the from_nodes and to_nodes' weight is different,
>> then we can don't preserv of the relative position of the page to the beginning
>> of the node set. This case if a page allocation from the dest node
>> failed, it will
>> be allocated from the next node instead of early return.
>
> Understand what you are doing first. The fallback is already there.
>
>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>> index 08f40a2..094d092 100644
>> --- a/mm/mempolicy.c
>> +++ b/mm/mempolicy.c
>> @@ -842,7 +842,8 @@ static void migrate_page_add(struct page *page,
>> struct list_head *pagelist,
>>
>>  static struct page *new_node_page(struct page *page, unsigned long
>> node, int **x)
>>  {
>> -       return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE, 0);
>> +       return alloc_pages_exact_node(node,
>> +                       GFP_HIGHUSER_MOVABLE | GFP_THISNODE, 0);
>
> You eliminate falling back to the next node?
>
> GFP_THISNODE forces allocation from the node. Without it we will fallback.
>

Yeah, but I think we shouldn't fallback at this case, what we want is
alloc a page
from exactly the dest node during migrate_to_node(dest).So I added
GFP_THISNODE.

And mel concerned that
====
This appears to be a valid bug fix.  I agree that the way things are structured
that __GFP_THISNODE should be used in new_node_page(). But maybe a follow-on
patch is also required. The behaviour is now;

o new_node_page will not return NULL if the target node is empty (fine).
o migrate_pages will translate this into -ENOMEM (fine)
o do_migrate_pages breaks early if it gets -ENOMEM ?

It's the last part I'd like you to double check. migrate_pages() takes a
nodemask of allowed nodes to migrate to. Rather than sending this down
to the allocator, it iterates over the nodes allowed in the mask. If one
of those nodes is full, it returns -ENOMEM.

If -ENOMEM is returned from migrate_pages, should it not move to the
next node?
====

In my opinion, when we want to preserve the relative position of the page to
the beginning of the node set, early return is ok. Else should try to alloc the
new page from the next node(to_nodes).

So I added retry path to allocate new page from next node only when
from_nodes' weight is different from to_nodes', this case the user should
konw the relative position of the page to the beginning of the node set
can be changed.

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 08f40a2..094d092 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -842,7 +842,8 @@ static void migrate_page_add(struct page *page,
struct list_head *pagelist,

 static struct page *new_node_page(struct page *page, unsigned long
node, int **x)
 {
-       return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE, 0);
+       return alloc_pages_exact_node(node,
+                       GFP_HIGHUSER_MOVABLE | GFP_THISNODE, 0);
 }

 /*
@@ -945,10 +946,26 @@ int do_migrate_pages(struct mm_struct *mm,

               node_clear(source, tmp);
               err = migrate_to_node(mm, source, dest, flags);
+retry:
               if (err > 0)
                       busy += err;
-               if (err < 0)
+               if (err < 0) {
+                       /*
+                        * If the dest node have no enough memory, and
from_nodes
+                        * to_nodes have no equal weight(don't need
protect offset.)
+                        * try to migrate to next node.
+                        */
+                       if((nodes_weight(*from_nodes) !=
nodes_weight(*to_nodes))
+                               && (err == -ENOMEM)) {
+                               for_each_node_mask(s, *to_nodes) {
+                                       if(s != dest) {
+                                               err =
migrate_to_node(mm, source, s, flags);
+                                               goto retry;
+                                       }
+                               }
+                       }
                       break;
+               }

Thanks!
-- 
Regards,
--Bob

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href