Dear Kosaki,

Have you had a chance to review my change to the function all_unreclaimable()?

@@ -2353,7 +2353,9 @@ static bool all_unreclaimable(struct zonelist *zonelist,
 			continue;
 		if (!cpuset_zone_allowed_hardwall(zone, GFP_KERNEL))
 			continue;
-		if (!zone->all_unreclaimable)
+		if (zone->all_unreclaimable)
+			continue;
+		if (zone_reclaimable(zone))
 			return false;
 	}

In my test, it helped to avoid the infinite loop in the direct reclaim path, and I think it should also avoid the kernel hang you hit in commit 929bea7c714220.

In short, I think that checking only zone->all_unreclaimable, or only zone_reclaimable(), is not enough in all_unreclaimable(). Shall we check both to confirm whether a zone is all_unreclaimable?

Thanks!

Best Regards
Lisa Du

-----Original Message-----
From: Lisa Du
Sent: July 26, 2013 9:11
To: 'KOSAKI Motohiro'
Cc: Christoph Lameter; linux-mm@xxxxxxxxx; Mel Gorman; Bob Liu
Subject: RE: Possible deadloop in direct reclaim?

Dear KOSAKI,

In my test, I didn't enable compaction. Maybe compaction would help to avoid this issue; I can try it later.
As I understand it, CONFIG_COMPACTION is an optional configuration, right? If we don't use it and hit such an issue, how should we deal with the infinite loop?

I made a change in the all_unreclaimable() function. It passed overnight tests; please help review, thanks in advance!

@@ -2353,7 +2353,9 @@ static bool all_unreclaimable(struct zonelist *zonelist,
 			continue;
 		if (!cpuset_zone_allowed_hardwall(zone, GFP_KERNEL))
 			continue;
-		if (!zone->all_unreclaimable)
+		if (zone->all_unreclaimable)
+			continue;
+		if (zone_reclaimable(zone))
 			return false;
 	}

Thanks!

Best Regards
Lisa Du

-----Original Message-----
From: KOSAKI Motohiro [mailto:kosaki.motohiro@xxxxxxxxx]
Sent: July 26, 2013 2:19
To: Lisa Du
Cc: Christoph Lameter; linux-mm@xxxxxxxxx; Mel Gorman; Bob Liu
Subject: Re: Possible deadloop in direct reclaim?

On Tue, Jul 23, 2013 at 9:21 PM, Lisa Du <cldu@xxxxxxxxxxx> wrote:
> Dear Christoph
> Thanks a lot for your comment.
> When this issue happened, I triggered a kernel panic and got the kdump.
> From the kdump, I got the global variable pg_data_t contig_page_data. From this structure, I can see that in the normal zone only order-0 has nr_free = 18442 and order-1 has nr_free = 367; nr_free is 0 for all the other orders.

Don't you use compaction? Or if you do, please get a log via tracepoints. We need to know why it doesn't work.
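
For reference, the effect of the proposed change to all_unreclaimable() can be sketched in user space roughly as below. This is only a toy model, not kernel code: struct toy_zone, its fields, and the toy_* functions are hypothetical stand-ins for the kernel structures, and the factor of 6 in toy_zone_reclaimable() is an assumed threshold used purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct zone. */
struct toy_zone {
    bool all_unreclaimable;          /* flag; may be stale if nothing updates it */
    unsigned long pages_scanned;     /* pages scanned without freeing anything */
    unsigned long reclaimable_pages; /* pages that could still be reclaimed */
};

/* Toy heuristic standing in for zone_reclaimable(); the factor 6 is assumed. */
static bool toy_zone_reclaimable(const struct toy_zone *z)
{
    return z->pages_scanned < z->reclaimable_pages * 6;
}

/* Behaviour before the patch: only the flag is consulted. */
static bool toy_all_unreclaimable_old(const struct toy_zone *zones, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!zones[i].all_unreclaimable)
            return false;
    }
    return true;
}

/*
 * Behaviour with the patch: a zone counts as reclaimable only if the flag
 * is clear AND the scanning heuristic still considers it reclaimable.
 */
static bool toy_all_unreclaimable(const struct toy_zone *zones, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (zones[i].all_unreclaimable)
            continue;
        if (toy_zone_reclaimable(&zones[i]))
            return false;
    }
    return true;
}
```

In this model, a zone whose flag is still clear but which has already been scanned heavily (for example pages_scanned = 600, reclaimable_pages = 100) makes the old check report "still reclaimable", so the caller keeps looping, while the patched check reports that everything is unreclaimable and lets the caller give up.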