Re: [PATCH] do_migrate_range: avoid failure as much as possible

On Mon, Oct 25, 2010 at 12:34:48PM +0800, KAMEZAWA Hiroyuki wrote:
> On Mon, 25 Oct 2010 12:06:04 +0800
> Wu Fengguang <fengguang.wu@xxxxxxxxx> wrote:
> 
> > On Mon, Oct 25, 2010 at 11:48:16AM +0800, KAMEZAWA Hiroyuki wrote:
> > > On Mon, 25 Oct 2010 11:48:33 +0800
> > > Wu Fengguang <fengguang.wu@xxxxxxxxx> wrote:
> > > 
> > > > On Mon, Oct 25, 2010 at 11:09:01AM +0800, KAMEZAWA Hiroyuki wrote:
> > > > > On Mon, 25 Oct 2010 12:05:50 +0900
> > > > > KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> > > > > 
> > > > > > This changes behavior.
> > > > > > 
> > > > > > This "ret" can be > 0 because migrate_page()'s return code is
> > > > > > "Return: Number of pages not migrated or error code."
> > > > > > 
> > > > > > Then, 
> > > > > > ret < 0  ===> maybe ebusy
> > > > > > ret > 0  ===> some pages are not migrated. maybe PG_writeback or some
> > > > > > ret == 0 ===> ok, all condition green. try next chunk soon.
> > > > > > 
> > > > > > > Then, I added "yield()" and --retry_max for !ret cases.
> > > > > >                                               ^^^^^^^^
> > > > > 						wrong.
> > > > > 
> > > > > The code here does
> > > > > 
> > > > > ret == 0 ==> ok, all condition green, try next chunk.
> > > > 
> > > > > It seems reasonable to remove the drain operations for the "ret == 0"
> > > > > case.  That would help large NUMA boxes noticeably, I guess.
> > > > 
> > > Maybe.

OK, I'll post a patch for it.

> > > > > ret > 0  ==> all pages are isolated but some pages cannot be migrated. maybe under I/O
> > > > > 	     do yield.
> > > > 
> > > > Don't know how to deal with the possible "migration fail" pages --
> > > > sorry I have no idea about that situation at all.
> > > > 
> > > 
> > > In the typical case, page_count() > 0 because of get_user_pages(), or
> > > PG_writeback is set. All we can do is wait.
> > 
> > OK.
> > 
> > > > Perhaps, OOM while offlining pages?
> > > > 
> > > 
> > > I have never seen that, because memory offline is scheduled to be done
> > > only when there is free memory.
> > 
> > OK.
> > 
> > On OOM migrate_page() will return -ENOMEM, which will be handled in
> > the "ret < 0" case. So it will give up after some retries.
> > 
> > migrate_page() has a comment /* Permanent failure */ when returning
> > positive ret. So it looks safer not to retry indefinitely on the
> > "ret > 0" case?
> > 
> > Then it's reduced to two cases: "ret != 0, cannot make smooth
> > progress, unconditional retries may livelock" and "ret == 0, makes some
> > progress, safe to retry".
> > 
> Memory offline is designed so that it can be stopped by Ctrl-C, and it has
> a timeout of 120 sec.
> 
> I wouldn't call that a livelock.

Ah, sorry for overlooking that!  I should really think twice... (After
thinking twice) I find it's even better: unmigratable pages will be
put back on the LRU, and -EBUSY will then be returned when trying to
isolate them the next time. So it's an imaginary problem.
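
For illustration only, here is a rough userspace sketch of that argument.
It is not kernel code: classify() and passes_until_give_up() are
hypothetical names, the retry budget passed in is an arbitrary assumption,
and the model simply assumes what is described above, namely that a page
that fails migration is put back and the next isolation attempt returns
-EBUSY.

```c
#include <errno.h>

/* What the offline loop does with the result of one migration pass. */
enum action { NEXT_CHUNK, RETRY_AFTER_YIELD, GIVE_UP };

/*
 * ret < 0  : error code, counts against the retry budget
 * ret > 0  : number of pages not migrated, retried without using the budget
 * ret == 0 : everything migrated, move on to the next chunk
 */
static enum action classify(int ret, int *retry_max)
{
	if (ret == 0)
		return NEXT_CHUNK;
	if (ret < 0 && --(*retry_max) == 0)
		return GIVE_UP;
	return RETRY_AFTER_YIELD;
}

/*
 * Model a chunk holding one page that can never be migrated.
 * The first pass returns 1 (one page not migrated); the page is then
 * put back, so every later pass is assumed to fail with -EBUSY.
 * Returns the number of passes made before giving up.
 * retry_max must be at least 1.
 */
static int passes_until_give_up(int retry_max)
{
	int passes = 0;
	int put_back = 0;	/* set once the page has failed migration */

	for (;;) {
		int ret;

		passes++;
		if (put_back) {
			ret = -EBUSY;	/* isolation of the put-back page fails */
		} else {
			ret = 1;	/* isolated, but not migrated */
			put_back = 1;
		}
		if (classify(ret, &retry_max) == GIVE_UP)
			return passes;
	}
}
```

In this model the "ret > 0" pass happens once per stuck page, after which
the loop is on the bounded "ret < 0" path, in addition to the Ctrl-C and
timeout exits mentioned above.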

Thanks,
Fengguang
