On Thu, 2010-12-02 at 08:54 +0800, Minchan Kim wrote: > On Thu, Dec 2, 2010 at 9:29 AM, Shaohua Li <shaohua.li@xxxxxxxxx> wrote: > > On Wed, 2010-12-01 at 23:58 +0800, Minchan Kim wrote: > >> On Wed, Dec 01, 2010 at 06:44:27PM +0900, KOSAKI Motohiro wrote: > >> > > T0: Task1 wakeup_kswapd(order=3) > >> > > T1: kswapd enters balance_pgdat > >> > > T2: Task2 wakeup_kswapd(order=2), because pages reclaimed by kswapd are used > >> > > quickly > >> > > T3: kswapd exits balance_pgdat. kswapd will do check. Now new order=2, > >> > > pgdat->kswapd_max_order will become 0, but order=3, if sleeping_prematurely, > >> > > then order will become pgdat->kswapd_max_order(0), while at this time the > >> > > order should 2 > >> > > This isn't a big deal, but we do have a small window the order is wrong. > >> > > > >> > > Signed-off-by: Shaohua Li <shaohua.li@xxxxxxxxx> > >> > > > >> > > diff --git a/mm/vmscan.c b/mm/vmscan.c > >> > > index d31d7ce..15cd0d2 100644 > >> > > --- a/mm/vmscan.c > >> > > +++ b/mm/vmscan.c > >> > > @@ -2450,7 +2450,7 @@ static int kswapd(void *p) > >> > > } > >> > > } > >> > > > >> > > - order = pgdat->kswapd_max_order; > >> > > + order = max_t(unsigned long, new_order, pgdat->kswapd_max_order); > >> > > } > >> > > finish_wait(&pgdat->kswapd_wait, &wait); > >> > > >> > Good catch! > >> > > >> > But unfortunatelly, the code is not correct. At least, don't fit corrent > >> > design. > >> > > >> > 1) if "order < new_order" condition is false, we already decided to don't > >> > use new_order. So, we shouldn't use new_order after kswapd_try_to_sleep() > >> > 2) if sleeping_prematurely() return false, it probably mean > >> > zone_watermark_ok_safe(zone, order, high_wmark) return false. > >> > therefore, we have to retry reclaim by using old 'order' parameter. > >> > >> Good catch, too. > >> > >> In Shaohua's scenario, if Task1 gets the order-3 page after kswapd's reclaiming, > >> it's no problem. > >> But if Task1 doesn't get the order-3 page and others used the order-3 page for Task1, > >> Kswapd have to reclaim order-3 for Task1, again. > > why? it's just a possibility. Task1 might get its pages too. If Task1 > > doesn't get its pages, it will wakeup kswapd too with its order. > > > >> In addtion, new order is always less than old order in that context. > >> so big order page reclaim makes much safe for low order pages. > > big order page reclaim makes we have more chances to reclaim useful > > pages by lumpy, why it's safe? > > For example, It assume tat Task1 continues to fail get the order-3 > page of GFP_ATOMIC since other tasks continues to allocate order-2 > pages so that they steal pages. but even you reclaim order-3, you can't guarantee task1 can get the pages too. order-3 page can be steal by order-2 allocation > Then, your patch makes continue to > reclaim order-2 page in this scenario. Task1 never get the order-3 > pages if it doesn't have a merge luck. Task1 will wakeup kswapd again for order-3, so kswapd will reclaim order-3 very soon after the order-2 reclaim. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxxx For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/ Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>