On Tue 10-07-18 14:07:56, Marc Lehmann wrote:
> (I am not subscribed)
>
> Hi!
>
> While reporting another (not strictly related) kernel bug
> (https://bugzilla.kernel.org/show_bug.cgi?id=199931) I was encouraged to
> report my problem here, even though, in my opinion, I don't have enough
> hard data for a good bug report, so bear with me, please.
>
> Basically, the post 4.4 VM system (I think my troubles started around 4.6
> or 4.7) is nearly unusable on all of my (very different) systems that
> actually do some work, with symptoms being frequent OOM kills with many
> gigabytes of available memory, extended periods of semi-freezing with
> thrashing, and apparent hard lockups, almost certainly related to memory
> usage.

JFTR, we have discussed that off-list and Marc has provided an example
oom report:
[48190.574505] nvidia-modeset invoked oom-killer: gfp_mask=0x14040c0(GFP_KERNEL|__GFP_COMP), nodemask=(null), order=3, oom_score_adj=0
[48190.574508] nvidia-modeset cpuset=/ mems_allowed=0
[...]
[48190.574769] active_anon:960260 inactive_anon:175381 isolated_anon:0
                active_file:1061865 inactive_file:177006 isolated_file:0
                unevictable:0 dirty:273 writeback:0 unstable:0
                slab_reclaimable:1519864 slab_unreclaimable:61079
                mapped:31182 shmem:11064 pagetables:23135 bounce:0
                free:53178 free_pcp:68 free_cma:0
[...]
[48190.574783] Node 0 DMA: 0*4kB 2*8kB (U) 3*16kB (U) 2*32kB (U) 2*64kB (U) 2*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15872kB
[48190.574787] Node 0 DMA32: 2015*4kB (UME) 4517*8kB (UME) 5301*16kB (UE) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 129012kB
[48190.574791] Node 0 Normal: 6379*4kB (UME) 2915*8kB (UE) 1266*16kB (UE) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 69092kB

We are out of order-3+ pages in all eligible zones (please note that
the DMA zone is not really usable for this request).
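As a rough cross-check of the buddy lists above, here is a minimal Python
sketch that counts the free blocks of order 3 or higher in each zone. It is
not part of the original report: the helper names parse_buddy_line and
free_blocks_at_or_above are made up for illustration, and it assumes 4kB
base pages, so that an order-n block is 4kB * 2^n.

```python
import re

PAGE_KB = 4  # assumption: 4 kB base pages, so order n == 4 kB * 2**n

# The three buddy-list lines from the OOM report quoted above.
REPORT = """\
[48190.574783] Node 0 DMA: 0*4kB 2*8kB (U) 3*16kB (U) 2*32kB (U) 2*64kB (U) 2*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15872kB
[48190.574787] Node 0 DMA32: 2015*4kB (UME) 4517*8kB (UME) 5301*16kB (UE) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 129012kB
[48190.574791] Node 0 Normal: 6379*4kB (UME) 2915*8kB (UE) 1266*16kB (UE) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 69092kB
"""


def parse_buddy_line(line):
    """Return (zone_name, {order: free_block_count}) for one buddy-list line."""
    zone = re.search(r"(Node \d+ \w+):", line).group(1)
    counts = {}
    # Each entry looks like "<count>*<size>kB"; the trailing "= <total>kB"
    # has no "*" and is therefore skipped, as are flags such as "(UME)".
    for count, size_kb in re.findall(r"(\d+)\*(\d+)kB", line):
        order = (int(size_kb) // PAGE_KB).bit_length() - 1
        counts[order] = int(count)
    return zone, counts


def free_blocks_at_or_above(counts, min_order):
    """Number of free blocks that could satisfy an order-min_order request."""
    return sum(n for order, n in counts.items() if order >= min_order)


zones = dict(parse_buddy_line(line) for line in REPORT.splitlines())

for zone, counts in zones.items():
    print(zone, "order-3+ free blocks:", free_blocks_at_or_above(counts, 3))
```

For the three lines quoted above, DMA32 and Normal both come out at zero
order-3+ blocks. Only the DMA zone has any, and that zone is not usable for
this request.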
Different kernel versions have slightly different implementations of
compaction, so they might behave differently, but once compaction cannot
make any progress we are out of luck. It is quite unfortunate that nvidia
really insists on having an order-3 allocation. Maybe it can use kvmalloc
or __GFP_RETRY_MAYFAIL in current kernels.

It is quite surprising that we have so much memory and yet are not able to
find a single order-3 contiguous block. This smells suspicious. You have
previously mentioned that dropping caches helped, so I assume that fs
metadata is fragmenting the memory.

Anyway, I will go over your whole report later. I am quite busy right now.

Thanks for the report!
-- 
Michal Hocko
SUSE Labs