On Fri 17-03-17 22:24:40, Tetsuo Handa wrote: > On 2017/03/17 17:46, Michal Hocko wrote: > > On Thu 16-03-17 03:04:09, Philip J. Freeman wrote: > >> My laptop became almost totally un responsive today. I was able to > >> switch VTs but not log in and had to power cycle to regain control. I > >> don't understand what this means. Any ideas? > >> > >> Mar 14 14:31:20 x61s-44a5 kernel: [168382.032039] DOM Worker: page allocation stalls for 10646ms, order:0, mode:0x24280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO) > > [...] > >> Mar 14 14:31:22 x61s-44a5 kernel: [168382.032181] Mem-Info: > >> Mar 14 14:31:22 x61s-44a5 kernel: [168382.032192] active_anon:308454 inactive_anon:154809 isolated_anon:224 > >> Mar 14 14:31:22 x61s-44a5 kernel: [168382.032192] active_file:869 inactive_file:978 isolated_file:0 > >> Mar 14 14:31:22 x61s-44a5 kernel: [168382.032192] unevictable:0 dirty:0 writeback:0 unstable:0 > >> Mar 14 14:31:22 x61s-44a5 kernel: [168382.032192] slab_reclaimable:6099 slab_unreclaimable:8555 > >> Mar 14 14:31:22 x61s-44a5 kernel: [168382.032192] mapped:1999 shmem:156254 pagetables:2929 bounce:0 > >> Mar 14 14:31:22 x61s-44a5 kernel: [168382.032192] free:13192 free_pcp:0 free_cma:0 > > > > OK, so the allocation couldn't make a forward progress for more than > > 10s. You do not seem to have many file pages on the LRU lists left > > and so you only have anonymous memory as reclaimable. Slab doesn't > > have many pages either. Everything together makes it 1886MB out of 2GB. > > ~50MB is free so this means ~70MB is in unaccounted memory (50MB is > > reserved) which looks reasonably and I wouldn't suspect any kernel > > memory leak > > I don't suspect any kernel memory leak here. > > > And again the anonymous memory pressure grows. So I would suspect some > > userspace application went off the hook and started consuming a lot of > > anonymous memory which gets you to a trashing stage when basically > > nothing can move on much without swap out. The page cache is at its > > minimum and I suspect that most binaries would have to be read from disk > > and you reached the point of trashing. I am afraid we are not really > > great at handling these situations from the kernel well. Killing the > > memory hog would be probably the most sane thing to do. > > > > I don't know what "DOM Worker" process is. But guessing from that there is > "firefox-esr" process, "DOM Worker" is a process related to HTML5 Web Workers API. > Since web browser processes can heavily consume memory depending on the content > loaded (or memory leak of plugins), it is possible that you are overstressing > the system. > > "DMA32 free:" is below "DMA32 min:" which I think means that the OOM killer > would have been triggerred immediately if there is no swap. > > I guess there were other processes which stalled less than 10 seconds. Maybe > processes stalling at doing swap I/O exist, but we can't know them because > warn_alloc() threshold is not configurable and __GFP_NOWARN allocations are > not reported by warn_alloc(). Too bad. > > If you can rebuild your kernel, calling dump_tasks() in mm/oom_kill.c when > you hit warn_alloc() warnings might help. I do not really see how this would help much. If anything watching for /proc/vmstat counters would tell us much more. -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>