On 4/26/20 2:27 AM, Andrew Morton wrote:
> On Fri, 24 Apr 2020 13:48:06 -0700 (PDT) David Rientjes <rientjes@xxxxxxxxxx> wrote:
>
>> If GFP_ATOMIC allocations will start failing soon because the amount of
>> free memory is substantially under per-zone min watermarks, it is better
>> to oom kill a process rather than continue to reclaim.
>>
>> This intends to significantly reduce the number of page allocation
>> failures that are encountered when the demands of user and atomic
>> allocations overwhelm the ability of reclaim to keep up.  We can see this
>> with a high ingress of networking traffic where memory allocated in irq
>> context can overwhelm the ability to reclaim fast enough such that user
>> memory consistently loops.  In that case, we have reclaimable memory, and
> "user memory allocation", I assume?  Or maybe "blockable memory
> allocations".
>
>> reclaiming is successful, but we've fully depleted memory reserves that
>> are allowed for non-blockable allocations.
>>
>> Commit 400e22499dd9 ("mm: don't warn about allocations which stall for
>> too long") removed evidence of user allocations stalling because of this,
>> but the situation can apply anytime we get "page allocation failures"
>> where reclaim is happening but per-zone min watermarks are starved:
>>
>> Node 0 Normal free:87356kB min:221984kB low:416984kB high:611984kB active_anon:123009936kB inactive_anon:67647652kB active_file:429612kB inactive_file:209980kB unevictable:112348kB writepending:260kB present:198180864kB managed:195027624kB mlocked:81756kB kernel_stack:24040kB pagetables:11460kB bounce:0kB free_pcp:940kB local_pcp:96kB free_cma:0kB
>> lowmem_reserve[]: 0 0 0 0
>> Node 1 Normal free:105616kB min:225568kB low:423716kB high:621864kB active_anon:122124196kB inactive_anon:74112696kB active_file:39172kB inactive_file:103696kB unevictable:204480kB writepending:180kB present:201326592kB managed:198174372kB mlocked:204480kB kernel_stack:11328kB pagetables:3680kB bounce:0kB free_pcp:1140kB local_pcp:0kB free_cma:0kB
>> lowmem_reserve[]: 0 0 0 0
>>
>> Without this patch, there is no guarantee that user memory allocations
>> will ever be successful when non-blockable allocations overwhelm the
>> ability to get above per-zone min watermarks.
>>
>> This doesn't solve page allocation failures entirely since it's a
>> preemptive measure based on watermarks that requires concurrent blockable
>> allocations to trigger the oom kill.  To completely solve page allocation
>> failures, it would be possible to do the same watermark check for non-
>> blockable allocations and then queue a worker to asynchronously oom kill
>> if it finds watermarks to be sufficiently low as well.
>>
> Well, what's really going on here?
>
> Is networking potentially consuming an unbounded amount of memory?  If
> so, then killing a process will just cause networking to consume more
> memory then hit against the same thing.  So presumably the answer is
> "no, the watermarks are inappropriately set for this workload".
>
> So would it not be sensible to dynamically adjust the watermarks in
> response to this condition?  Maintain a larger pool of memory for these
> allocations?  Or possibly push back on networking and tell it to reduce
> its queue sizes?  So that stuff doesn't keep on getting oom-killed?
>

I think I have seen similar issues when dma-buf allocates a lot. But that
was on older kernels and out of tree. So networking may not be the only
cause. dma-bufs are used a lot for camera stuff on Android.
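
For anyone comparing their own logs against the zone lines quoted above, here is a minimal sketch that pulls the kB fields out of such a line and reports how much of the min watermark is actually free. The helper names (parse_zone_line, free_to_min_ratio) are made up for illustration and are not kernel interfaces. It also does not reproduce the check in David's patch, whose exact threshold is not stated in this thread.

```python
import re

def parse_zone_line(line):
    """Collect every 'name:<number>kB' field of a zone line into a dict of ints."""
    return {name: int(kb) for name, kb in re.findall(r"(\w+):(\d+)kB", line)}

def free_to_min_ratio(fields):
    """Free memory as a fraction of the min watermark (< 1.0 means below min)."""
    return fields["free"] / fields["min"]

# Leading fields of the two zone lines quoted in this thread.
node0 = "Node 0 Normal free:87356kB min:221984kB low:416984kB high:611984kB"
node1 = "Node 1 Normal free:105616kB min:225568kB low:423716kB high:621864kB"

for line in (node0, node1):
    fields = parse_zone_line(line)
    print(f"free={fields['free']}kB min={fields['min']}kB "
          f"ratio={free_to_min_ratio(fields):.2f}")
```

With the numbers from the report, free memory is at roughly 39% of min on node 0 and 47% on node 1, which is the starved state being discussed.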