On Fri, 14 Aug 2020 09:49:26 -0700 Doug Berger <opendmb@xxxxxxxxx> wrote:

> The lowmem_reserve arrays provide a means of applying pressure
> against allocations from lower zones that were targeted at
> higher zones. Their values are a function of the number of pages
> managed by higher zones and are assigned by a call to the
> setup_per_zone_lowmem_reserve() function.
>
> The function is initially called at boot time by the function
> init_per_zone_wmark_min() and may be called later by accesses
> of the /proc/sys/vm/lowmem_reserve_ratio sysctl file.
>
> The function init_per_zone_wmark_min() was moved up from a
> module_init to a core_initcall to resolve a sequencing issue
> with khugepaged. Unfortunately this created a sequencing issue
> with CMA page accounting.
>
> The CMA pages are added to the managed page count of a zone
> when cma_init_reserved_areas() is called at boot, also as a
> core_initcall. This makes it uncertain whether the CMA pages
> will be added to the managed page counts of their zones before
> or after the call to init_per_zone_wmark_min(), as it becomes
> dependent on link order. With the current link order the pages
> are added to the managed count after the lowmem_reserve arrays
> are initialized at boot.
>
> This means the lowmem_reserve values at boot may be lower than
> the values used later if /proc/sys/vm/lowmem_reserve_ratio is
> accessed, even if the ratio values are unchanged.
>
> In many cases the difference is not significant, but, for example,
> an ARM platform with 1GB of memory and the following memory layout
>
> [    0.000000] cma: Reserved 256 MiB at 0x0000000030000000
> [    0.000000] Zone ranges:
> [    0.000000]   DMA      [mem 0x0000000000000000-0x000000002fffffff]
> [    0.000000]   Normal   empty
> [    0.000000]   HighMem  [mem 0x0000000030000000-0x000000003fffffff]
>
> would result in a lowmem_reserve of 0 for the DMA zone. This would
> allow userspace to deplete the DMA zone easily.

Sounds fairly serious for those machines. Was a cc:stable considered?
> Funnily enough
>
>     $ cat /proc/sys/vm/lowmem_reserve_ratio
>
> would fix up the situation because it forces
> setup_per_zone_lowmem_reserve() as a side effect.
>
> This commit breaks the link order dependency by invoking
> init_per_zone_wmark_min() as a postcore_initcall so that the
> CMA pages have the chance to be properly accounted in their
> zone(s), allowing the lowmem_reserve arrays to receive
> consistent values.