On 7/27/23 01:38, Roman Gushchin wrote:
> On Wed, Jul 26, 2023 at 10:53:04AM -0400, Johannes Weiner wrote:
>> On a memcache setup with heavy anon usage and no swap, we routinely
>> see premature OOM kills with multiple gigabytes of free space left:
>>
>> Node 0 Normal free:4978632kB [...] free_cma:4893276kB
>>
>> This free space turns out to be CMA. We set CMA regions aside for
>> potential hugetlb users on all of our machines, figuring that even if
>> there aren't any, the memory is available to userspace allocations.
>>
>> When the OOMs trigger, it's from unmovable and reclaimable allocations
>> that aren't allowed to dip into CMA. The non-CMA regions meanwhile are
>> dominated by the anon pages.
>>
>>
>> Because we have more options for CMA pages, change the policy to
>> always fill up CMA first. This reduces the risk of premature OOMs.
>
> I suspect it might cause regressions on small(er) devices where
> a relatively small cma area (Mb's) is often reserved for a use by various
> device drivers, which can't handle allocation failures well (even interim
> allocation failures). A startup time can regress too: migrating pages out of
> cma will take time.

Agreed, we should be more careful here.

> And given the velocity of kernel upgrades on such devices, we won't learn about
> it for next couple of years.
>
>> Movable pages can be migrated out of CMA when necessary, but we don't
>> have a mechanism to migrate them *into* CMA to make room for unmovable
>> allocations. The only recourse we have for these pages is reclaim,
>> which due to a lack of swap is unavailable in our case.
>
> Idk, should we introduce such a mechanism? Or use some alternative heuristics,
> which will be a better compromise between those who need cma allocations always
> pass and those who use large cma areas for opportunistic huge page allocations.
> Of course, we can add a boot flag/sysctl/per-cma-area flag, but I doubt we want
> really this.

At some point the solution was supposed to be ZONE_MOVABLE:

https://lore.kernel.org/linux-mm/1512114786-5085-1-git-send-email-iamjoonsoo.kim@xxxxxxx/

But it was reverted due to IIRC some bugs, and Joonsoo going MIA.

> Thanks!