RE: Possible deadloop in direct reclaim?

Dear Kosaki
   Have you had a chance to review my change to the function all_unreclaimable()?
@@ -2353,7 +2353,9 @@ static bool all_unreclaimable(struct zonelist *zonelist,
                        continue;
                if (!cpuset_zone_allowed_hardwall(zone, GFP_KERNEL))
                        continue;
-               if (!zone->all_unreclaimable)
+               if (zone->all_unreclaimable)
+                       continue;
+               if (zone_reclaimable(zone))
                        return false;
        }
   In my test, it avoided the infinite loop in the direct reclaim path, and I think it should also avoid the kernel hang you hit in commit 929bea7c714220.
   In short, I think checking only zone->all_unreclaimable or only zone_reclaimable() is not enough in all_unreclaimable(), so shall we check both to confirm whether a zone is unreclaimable?
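
   To make the intended logic concrete, here is a minimal userspace sketch of the combined check. It is a simplified model, not the kernel code: struct zone_model and the *_model function names are hypothetical, the cpuset check is left out, and the "scanned fewer than 6 times the reclaimable pages" rule is my assumption about how zone_reclaimable() decides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct zone (illustration only). */
struct zone_model {
    bool populated;                  /* zone has present pages */
    bool all_unreclaimable;          /* flag normally maintained by kswapd */
    unsigned long pages_scanned;     /* pages scanned since the last free */
    unsigned long reclaimable_pages; /* pages that could still be reclaimed */
};

/* Assumed heuristic: a zone still counts as reclaimable while it has been
 * scanned fewer than 6 times its reclaimable pages. */
static bool zone_reclaimable_model(const struct zone_model *z)
{
    return z->pages_scanned < z->reclaimable_pages * 6;
}

/* Combined check: report "all unreclaimable" unless some populated zone is
 * neither flagged all_unreclaimable nor exhausted by scanning. */
static bool all_unreclaimable_model(const struct zone_model *zones, size_t n)
{
    assert(zones != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        const struct zone_model *z = &zones[i];

        if (!z->populated)
            continue;
        if (z->all_unreclaimable)
            continue;   /* the flag alone is enough to skip the zone */
        if (zone_reclaimable_model(z))
            return false;   /* at least one zone can still make progress */
    }
    return true;
}
```

   In this model, a zone whose flag was never set but which has already been scanned 6 times over no longer keeps direct reclaim retrying, which is the case the patch is meant to cover.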

Thanks!

Best Regards
Lisa Du

-----Original Message-----
From: Lisa Du 
Sent: July 26, 2013 9:11
To: 'KOSAKI Motohiro'
Cc: Christoph Lameter; linux-mm@xxxxxxxxx; Mel Gorman; Bob Liu
Subject: RE: Possible deadloop in direct reclaim?

Dear KOSAKI
   In my test, I didn't enable compaction. Maybe compaction would help avoid this issue; I can try it later.
   As I understand it, CONFIG_COMPACTION is an optional configuration, right?
   If we don't use it and hit such an issue, how should we deal with the infinite loop?

   I made a change in the all_unreclaimable() function, and it passed overnight tests. Please help review it; thanks in advance!
@@ -2353,7 +2353,9 @@ static bool all_unreclaimable(struct zonelist *zonelist,
                        continue;
                if (!cpuset_zone_allowed_hardwall(zone, GFP_KERNEL))
                        continue;
-               if (!zone->all_unreclaimable)
+               if (zone->all_unreclaimable)
+                       continue;
+               if (zone_reclaimable(zone))
                        return false;
        }

Thanks!

Best Regards
Lisa Du


-----Original Message-----
From: KOSAKI Motohiro [mailto:kosaki.motohiro@xxxxxxxxx] 
Sent: July 26, 2013 2:19
To: Lisa Du
Cc: Christoph Lameter; linux-mm@xxxxxxxxx; Mel Gorman; Bob Liu
Subject: Re: Possible deadloop in direct reclaim?

On Tue, Jul 23, 2013 at 9:21 PM, Lisa Du <cldu@xxxxxxxxxxx> wrote:
> Dear Christoph
>    Thanks a lot for your comment. When this issue happened, I triggered a kernel panic and got the kdump.
> From the kdump, I got the global variable pg_data_t contig_page_data. From this structure, I can see that in the normal zone only order-0 has nr_free = 18442 and order-1 has nr_free = 367; nr_free is 0 for all the other orders.

Don't you use compaction? Or, if you do, please get a log via tracepoints.
We need to know why it doesn't work.