Re: shrink_inactive_list() failed to reclaim pages

[CC Minchan and Sergey for the zram part]

On Thu 12-01-17 01:16:11, Cheng-yu Lee wrote:
> Hi community,
> 
> I have an x86_64 Chromebook running a 3.14 kernel with 8G of memory. Using

Do you see the same with the current Linus tree?

> zram with swap size set to ~12GB. When memory is low, kswapd is woken up to
> reclaim pages, but under some circumstances the kernel cannot find pages
> to reclaim, even though I'm sure there is still plenty of memory that could
> be reclaimed from background processes. (For example, I run some C programs
> which just malloc() lots of memory and get suspended in the background.
> There's no reason they couldn't be swapped.) The consequence is that most
> CPU time is spent on page reclaim. The system hangs or becomes very
> laggy for a long period. Sometimes it even triggers a kernel panic from the
> hung task detector, like:
> <0>[46246.676366] Kernel panic - not syncing: hung_task: blocked tasks
> 
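A test program of the kind described above could look roughly like the
following. This is only a sketch of what "malloc() lots of memory and get
suspended" might mean; the helper names hog_memory() and suspend_self() are
made up for illustration and are not taken from the original report.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: allocate `bytes` of anonymous memory and write one
 * byte per page, so every page is actually backed by memory (or swap).
 * Returns the buffer, or NULL if malloc() fails. */
unsigned char *hog_memory(size_t bytes)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *buf;
    size_t off;

    assert(page > 0);
    buf = malloc(bytes);
    if (buf == NULL)
        return NULL;
    for (off = 0; off < bytes; off += (size_t)page)
        buf[off] = 0xA5;    /* dirty the page */
    return buf;
}

/* Hypothetical helper: stop the calling process, similar to Ctrl-Z in a
 * shell. The process keeps its memory but no longer touches it, so its
 * pages should be good swap candidates. Returns 0 once resumed. */
int suspend_self(void)
{
    return raise(SIGSTOP);
}
```

A main() that calls hog_memory() with a size of a few GB and then
suspend_self() would give one such background process; running several of
them should push the system into reclaim.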
> I've added kernel messages to trace the problem. I found that
> shrink_inactive_list() can barely find any pages to reclaim. More precisely,
> when the problem happens, lots of pages have _count > 2 in
> __remove_mapping(). So the
> condition at line 662 of vmscan.c holds:
> http://lxr.free-electrons.com/source/mm/vmscan.c#L662
> Thus the kernel fails to reclaim those pages at line 1209
> http://lxr.free-electrons.com/source/mm/vmscan.c#L1209
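
For anyone following along, the check being described can be modelled in
user space roughly as below. This is a simplified sketch, not the kernel
source: it assumes the condition in question is the step that freezes the
page reference count at exactly 2, and the toy_* names are invented for the
example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for struct page: only the reference count is modelled. */
struct toy_page {
    atomic_int count;
};

/* Model of the freeze step: succeeds only if the count is exactly
 * `expected`, and in that case atomically drops it to 0. On failure the
 * count is left unchanged. */
bool toy_freeze_refs(struct toy_page *page, int expected)
{
    return atomic_compare_exchange_strong(&page->count, &expected, 0);
}

/* Returns true if the page could be detached and freed.
 * The expected value 2 stands for one reference held by the swap or page
 * cache plus one held by the reclaimer that isolated the page. Any extra
 * reference (count > 2) makes the freeze fail, so the page is kept. */
bool toy_remove_mapping(struct toy_page *page)
{
    return toy_freeze_refs(page, 2);
}
```

In this model a page with count == 3 is never freed no matter how often it
is scanned, which matches the symptom of scanning a huge list while
reclaiming almost nothing.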

I assume that you are talking about the anonymous LRU.

> It's weird that the inactive anonymous list is huge (several GB), but
> nothing can really be freed. So I tried a hack to see if moving more pages
> from the active list helps. I commented out the "inactive_list_is_low()"
> check at line 2420 in shrink_node_memcg() so shrink_active_list() is
> always called.
> http://lxr.free-electrons.com/source/mm/vmscan.c#L2420
> It turns out that the hack helps. With more pages moved from the active
> list, kswapd works smoothly. The whole 12G zram can be used up before
> the system enters an OOM condition.
> 
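To illustrate why a several-GB inactive list would normally keep
shrink_active_list() from being called, here is a user-space model of the
"inactive list is low" heuristic. It is a sketch under assumptions: the
formula follows my recollection of mainline kernels from around 4.8 on (the
tree the lxr links point at), a 3.14 kernel uses a different per-zone ratio,
4 KiB pages are assumed, and the toy_* names are invented.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Floor of the integer square root, standing in for the kernel's
 * int_sqrt(). A simple loop is enough for the small inputs used here. */
unsigned long toy_int_sqrt(unsigned long x)
{
    unsigned long r = 0;

    while ((r + 1) * (r + 1) <= x)
        r++;
    return r;
}

/* Model of the heuristic: `inactive` and `active` are page counts.
 * The allowed active:inactive ratio grows with the list size in GB, and
 * the inactive list counts as "low" only when it is smaller than
 * active / ratio. Only then would the active list be shrunk. */
bool toy_inactive_list_is_low(unsigned long inactive, unsigned long active)
{
    unsigned long gb = (inactive + active) >> (30 - TOY_PAGE_SHIFT);
    unsigned long ratio = gb ? toy_int_sqrt(10 * gb) : 1;

    return inactive * ratio < active;
}
```

With 4 GB inactive and 2 GB active the model reports "not low", so the
active list would be left alone even if none of the inactive pages can be
freed. Commenting the check out, as in the hack, removes exactly that gate.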
> Any idea why the whole inactive anonymous LRU is occupied by pages which
> cannot be freed for a long time (several minutes before the system dies)?
> Are there any parameters I can tune to help the situation? I've tried
> swappiness but it doesn't help.
> 
> An alternative is to patch the kernel to call shrink_active_list() more
> frequently when it finds there's nothing that can be reclaimed. But I am
> not sure if it's the right direction. Also, it's not trivial to figure
> out where to add the call.
> 
> Thanks,
> Cheng-Yu

-- 
Michal Hocko
SUSE Labs
