Re: [PATCH] mempolicy:add GFP_THISNODE when allocing new page

Bob Liu <lliubbo@xxxxxxxxx> · Fri, 16 Apr 2010 23:03:02 +0800

On Fri, Apr 16, 2010 at 7:15 PM, Mel Gorman <mel@xxxxxxxxx> wrote:
> On Tue, Apr 13, 2010 at 10:28:35PM +0800, Bob Liu wrote:
>> On Tue, Apr 13, 2010 at 4:38 PM, Mel Gorman <mel@xxxxxxxxx> wrote:
>> > On Tue, Apr 13, 2010 at 04:20:53PM +0800, Bob Liu wrote:
>> >> On 4/6/10, Bob Liu <lliubbo@xxxxxxxxx> wrote:
>> >> > In funtion migrate_pages(), if the dest node have no
>> >> > enough free pages,it will fallback to other nodes.
>> >> > Add GFP_THISNODE to avoid this, the same as what
>> >> > funtion new_page_node() do in migrate.c.
>> >> >
>> >> > Signed-off-by: Bob Liu <lliubbo@xxxxxxxxx>
>> >> > ---
>> >> >  mm/mempolicy.c |    3 ++-
>> >> >  1 files changed, 2 insertions(+), 1 deletions(-)
>> >> >
>> >> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>> >> > index 08f40a2..fc5ddf5 100644
>> >> > --- a/mm/mempolicy.c
>> >> > +++ b/mm/mempolicy.c
>> >> > @@ -842,7 +842,8 @@ static void migrate_page_add(struct page *page, struct list_head *pagelist,
>> >> >
>> >> >  static struct page *new_node_page(struct page *page, unsigned long node, int **x)
>> >> >  {
>> >> > -       return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE, 0);
>> >> > +       return alloc_pages_exact_node(node,
>> >> > +                               GFP_HIGHUSER_MOVABLE | GFP_THISNODE, 0);
>> >> >  }
>> >> >
>> >>
>> >> Hi, Minchan and Kame
>> >>      Would you please add ack or review to this thread. It's BUGFIX
>> >> and not change, so i don't resend one.
>> >>
>> >
>> > Sorry for taking so long to get around to this thread. I talked on this
>> > patch already but it's in another thread. Here is what I said there
>> >
>> > ====
>> > This appears to be a valid bug fix.  I agree that the way things are structured
>> > that __GFP_THISNODE should be used in new_node_page(). But maybe a follow-on
>> > patch is also required. The behaviour is now;
>> >
>> > o new_node_page will not return NULL if the target node is empty (fine).
>> > o migrate_pages will translate this into -ENOMEM (fine)
>> > o do_migrate_pages breaks early if it gets -ENOMEM ?
>> >
>> > It's the last part I'd like you to double check. migrate_pages() takes a
>> > nodemask of allowed nodes to migrate to. Rather than sending this down
>> > to the allocator, it iterates over the nodes allowed in the mask. If one
>> > of those nodes is full, it returns -ENOMEM.
>> >
>> > If -ENOMEM is returned from migrate_pages, should it not move to the
>> > next node?
>> > ====
>>
>> Hm.I think early return is ok but not sure about it :)
>>
>> As Christoph said
>> "The intended semantic is the preservation of the relative position of the
>> page to the beginning of the node set."
>> "F.e. if you use page
>> migration (or cpuset automigration) to shift an application running on 10
>> nodes up by two nodes to make a hole that would allow you to run another
>> application on the lower nodes. Applications place pages intentionally on
>> certain nodes to be able to manage memory distances."
>>
>> If move to the next node instead of early return, the relative position of the
>> page to the beginning of the node set will be break;
>>
>
> Yeah, but the user requested that a number of nodes to be used. Sure, if the
> first node is not free, a page will be allocated on the second node instead
> but is that not what was requested? If the user wanted one and only one
> node to be used, they wouldn't have specified multiple nodes. I'm not
> convinced an early return is what was intended here.
>

Hmm.
What about this change? If the from_nodes and to_nodes' weight is different,
then we can don't preserv of the relative position of the page to the beginning
of the node set. This case if a page allocation from the dest node
failed, it will
be allocated from the next node instead of early return.

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 08f40a2..094d092 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -842,7 +842,8 @@ static void migrate_page_add(struct page *page,
struct list_head *pagelist,

 static struct page *new_node_page(struct page *page, unsigned long
node, int **x)
 {
-       return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE, 0);
+       return alloc_pages_exact_node(node,
+                       GFP_HIGHUSER_MOVABLE | GFP_THISNODE, 0);
 }

 /*
@@ -945,10 +946,26 @@ int do_migrate_pages(struct mm_struct *mm,

                node_clear(source, tmp);
                err = migrate_to_node(mm, source, dest, flags);
+retry:
                if (err > 0)
                        busy += err;
-               if (err < 0)
+               if (err < 0) {
+                       /*
+                        * If the dest node have no enough memory, and
from_nodes
+                        * to_nodes have no equal weight(don't need
protect offset.)
+                        * try to migrate to next node.
+                        */
+                       if((nodes_weight(*from_nodes) !=
nodes_weight(*to_nodes))
+                               && (err == -ENOMEM)) {
+                               for_each_node_mask(s, *to_nodes) {
+                                       if(s != dest) {
+                                               err =
migrate_to_node(mm, source, s, flags);
+                                               goto retry;
+                                       }
+                               }
+                       }
                        break;
+               }

Thanks !
-- 
Regards,
--Bob

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href