Re: Possible deadloop in direct reclaim?

Minchan Kim <minchan@xxxxxxxxxx> · Thu, 1 Aug 2013 14:43:38 +0900

Hello,

On Mon, Jul 22, 2013 at 09:58:17PM -0700, Lisa Du wrote:
> Dear Sir:
> I have hit a possible endless loop in direct reclaim. After running a large number of applications, the system reaches a state where memory is highly fragmented, with only order-0 and order-1 pages left.
> One process then requested an order-2 buffer and entered an endless direct reclaim loop. From my trace log, I can see the loop has already run over 200,000 times. Kswapd was woken up first and then went back to sleep because it could not rebalance memory for this order. But zone->all_unreclaimable remains 1.
> Although direct reclaim returns no pages every time, it loops again and again because zone->all_unreclaimable = 1, even when zone->pages_scanned has become very large. This blocks the process for a long time, until a watchdog thread detects it and kills the process. I know this is in __alloc_pages_slowpath, but it is too slow, right? It can take over 50 seconds or even more.
> I don't think this is the expected behavior, right? Could we add the check below to all_unreclaimable() to terminate this loop?
> 
> @@ -2355,6 +2355,8 @@ static bool all_unreclaimable(struct zonelist *zonelist,
>                         continue;
>                 if (!zone->all_unreclaimable)
>                         return false;
> +               if (sc->nr_reclaimed == 0 && !zone_reclaimable(zone))
> +                       return true;
>         }
> 
> BTW: I'm using kernel 3.4. I also searched kernel 3.9 and didn't see a possible fix for this issue. Has anyone else met this issue before? Any comments are welcome, and I look forward to your reply!
> 
> Thanks!

I'd like to ask some things.

1. Do you have swap enabled?
2. Is CONFIG_COMPACTION enabled?
3. Could we get your zoneinfo via cat /proc/zoneinfo?
4. If you disable the watchdog thread, do you eventually see an OOM,
   even if it takes a very long time?
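
If it helps, the first three items can be collected in one go with
something like the sketch below. It is only a rough helper, not an
existing tool: the function name collect_reclaim_info is made up for
illustration, and the CONFIG_COMPACTION check assumes the kernel was
built with CONFIG_IKCONFIG_PROC so that /proc/config.gz exists. If it
does not, please check the .config used for your build instead.

```shell
#!/bin/sh
# Sketch only: gathers the information asked for in questions 1-3.
# collect_reclaim_info is a hypothetical helper name.
collect_reclaim_info() {
    echo "== swap =="
    # /proc/swaps lists active swap areas; a header line alone means no swap.
    cat /proc/swaps 2>/dev/null || echo "(cannot read /proc/swaps)"

    echo "== CONFIG_COMPACTION =="
    # Assumption: /proc/config.gz is present (needs CONFIG_IKCONFIG_PROC).
    if [ -r /proc/config.gz ]; then
        # Also matches the "# CONFIG_COMPACTION is not set" form.
        zcat /proc/config.gz | grep 'CONFIG_COMPACTION' \
            || echo "(CONFIG_COMPACTION not found in config)"
    else
        echo "(no /proc/config.gz; check the .config used for the build)"
    fi

    echo "== zoneinfo =="
    cat /proc/zoneinfo 2>/dev/null || echo "(cannot read /proc/zoneinfo)"
}

collect_reclaim_info
```

Running it while the process is stuck in the reclaim loop would be the
most useful, since the zoneinfo counters change over time.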

> 
> Best Regards
> Lisa Du
> 

-- 
Kind regards,
Minchan Kim
