Hello,

On (01/11/17 18:38), Michal Hocko wrote:
> On Thu 12-01-17 01:16:11, Cheng-yu Lee wrote:
> > Hi community,
> >
> > I have an x86_64 Chromebook running a 3.14 kernel with 8G of memory. Using
>
> Do you see the same with the current Linus tree?
>
> > zram with swap size set to ~12GB. When in low memory, kswapd is awakened to
> > reclaim pages, but under some circumstances the kernel cannot find pages
> > to reclaim while I'm sure there is still plenty of memory which could be
> > reclaimed from background processes (for example, I run some C programs
> > which just malloc() lots of memory and get suspended in the background.
> > There's no reason they couldn't be swapped). The consequence is that most of
> > the CPU time is spent on page reclamation. The system hangs or becomes very
> > laggy for a long period. Sometimes it even triggers a kernel panic by the
> > hung task detector like:
> > <0>[46246.676366] Kernel panic - not syncing: hung_task: blocked tasks
> >
> > I've added kernel messages to trace the problem. I found shrink_inactive_list()
> > can barely find any page to reclaim. More precisely, when the problem
> > happens, lots of pages have _count > 2 in __remove_mapping(). So the
> > condition at line 662 of vmscan.c holds:
> > http://lxr.free-electrons.com/source/mm/vmscan.c#L662
> > Thus the kernel fails to reclaim those pages at line 1209
> > http://lxr.free-electrons.com/source/mm/vmscan.c#L1209
>
> I assume that you are talking about the anonymous LRU

Hm. As a side note, I think this is not the first time I've seen a
"kswapd consumes 100% CPU" report.

https://bugzilla.kernel.org/show_bug.cgi?id=65201#c50
http://lkml.iu.edu//hypermail/linux/kernel/1601.2/03564.html
https://marc.info/?l=linux-mm&m=145442159521487
https://marc.info/?l=linux-mm&m=145443027124595

	-ss

> > It's weird that the inactive anonymous list is huge (several GB), but
> > nothing can really be freed. So I did some hack to see if moving more pages
> > from the active list helps.
> > I commented out the "inactive_list_is_low()" check at line 2420
> > in shrink_node_memcg() so shrink_active_list() is always called.
> > http://lxr.free-electrons.com/source/mm/vmscan.c#L2420
> > It turns out that the hack helps. When more pages are moved from the active
> > list, kswapd works smoothly. The whole 12G zram can be used up before
> > the system enters an OOM condition.
> >
> > Any idea why the whole inactive anonymous LRU is occupied by pages which
> > cannot be freed for a long time (several minutes before the system dies)?
> > Are there any parameters I can tune to help the situation? I've tried
> > swappiness but it doesn't help.
> >
> > An alternative is to patch the kernel to call shrink_active_list() more
> > frequently when it finds there's nothing that can be reclaimed. But I am
> > not sure if it's the right direction. Also it's not so trivial to figure
> > out where to add the call.
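
For anyone trying to reproduce this, a small user-space sketch for watching
the anonymous LRU sizes while the stall develops might be handy. It only reads
the Active(anon), Inactive(anon) and SwapFree fields that /proc/meminfo
exports. The helper names parse_meminfo() and anon_lru_summary() are made up
for illustration and nothing here comes from the original report, so treat it
as a rough starting point rather than a proper diagnostic tool.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain
    counts. Lines without a leading integer value are skipped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def anon_lru_summary(fields):
    """Pick out the values relevant to anonymous-page reclaim (hypothetical helper)."""
    return {
        "active_anon_kb": fields.get("Active(anon)", 0),
        "inactive_anon_kb": fields.get("Inactive(anon)", 0),
        "swap_free_kb": fields.get("SwapFree", 0),
    }


if __name__ == "__main__":
    # Assumes a Linux system that exposes /proc/meminfo.
    with open("/proc/meminfo") as f:
        print(anon_lru_summary(parse_meminfo(f.read())))
```

Running it in a loop (say, once a second) next to the kswapd CPU usage should
show whether Inactive(anon) stays large while SwapFree stops shrinking, which
is the pattern described in the report.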