On Thu 13-08-20 10:55:17, Doug Berger wrote:
[...]
> One example might be a 1GB arm platform that defines a 256MB default CMA
> region. The default zones might map as follows:
> [ 0.000000] cma: Reserved 256 MiB at 0x0000000030000000
> [ 0.000000] Zone ranges:
> [ 0.000000]   DMA      [mem 0x0000000000000000-0x000000002fffffff]
> [ 0.000000]   Normal   empty
> [ 0.000000]   HighMem  [mem 0x0000000030000000-0x000000003fffffff]
[...]
>
> Here you can see that the lowmem_reserve array for the DMA zone is all
> 0's. This is because the HighMem zone is consumed by the CMA region
> whose pages haven't been activated to increase the zone managed count
> when init_per_zone_wmark_min() is invoked at boot.
>
> If we access the /proc/sys/vm/lowmem_reserve_ratio sysctl with:
> # cat /proc/sys/vm/lowmem_reserve_ratio
> 256	32	0	0

Yes, this is really an unexpected behavior.

[...]
> Here the lowmem_reserve back pressure for the DMA zone for allocations
> that target the HighMem zone is now 256 pages. Now 1MB is still not a
> lot of additional back pressure, but the watermarks on the HighMem zone
> aren't very large either so User space allocations can easily start
> consuming the DMA zone while kswapd starts trying to reclaim space in
> HighMem. This excess pressure on DMA zone memory can potentially lead to
> earlier triggers of OOM Killer and/or kernel fallback allocations into
> CMA Movable pages which can interfere with the ability of CMA to obtain
> larger size contiguous allocations.
>
> All of that said, my main concern is that I don't like the inconsistency
> between the boot time and run time results.

Thanks for the clarification. I would suggest extending your changelog by
the following.
"
In many cases the difference is not significant, but for example an ARM
platform with 1GB of memory and the following memory layout

[ 0.000000] cma: Reserved 256 MiB at 0x0000000030000000
[ 0.000000] Zone ranges:
[ 0.000000]   DMA      [mem 0x0000000000000000-0x000000002fffffff]
[ 0.000000]   Normal   empty
[ 0.000000]   HighMem  [mem 0x0000000030000000-0x000000003fffffff]

would result in 0 lowmem_reserve for the DMA zone. This would allow
userspace to deplete the DMA zone easily. Funnily enough

$ cat /proc/sys/vm/lowmem_reserve_ratio

would fix up the situation because it forces
setup_per_zone_lowmem_reserve as a side effect.
"

With that feel free to add
Acked-by: Michal Hocko <mhocko@xxxxxxxx>

Thanks!
--
Michal Hocko
SUSE Labs
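
As a minimal standalone sketch (not the kernel implementation), the
following program works through the arithmetic quoted above: the DMA
zone's lowmem_reserve entry for HighMem allocations comes out as the
upper zone's managed pages divided by the DMA slot of
lowmem_reserve_ratio. Assuming 4 KiB pages, it reproduces the 256
pages (~1MB) of back pressure once the 256 MiB of CMA pages are
counted as managed, and 0 while they are not.

/*
 * Illustration only, not kernel code: derive the DMA zone's
 * lowmem_reserve entry for HighMem-targeted allocations from the
 * numbers quoted in this thread.
 */
#include <stdio.h>

int main(void)
{
	/* DMA slot of lowmem_reserve_ratio as reported by the sysctl: 256 32 0 0 */
	const unsigned long dma_ratio = 256;

	/* HighMem zone once its 256 MiB of CMA pages count as managed (4 KiB pages) */
	const unsigned long highmem_managed_pages = (256UL << 20) / 4096;

	/* back pressure reserved in the DMA zone for HighMem allocations */
	unsigned long reserve = highmem_managed_pages / dma_ratio;

	printf("lowmem_reserve[HighMem] for DMA zone: %lu pages (%lu KiB)\n",
	       reserve, reserve * 4);
	/*
	 * Prints 256 pages (1024 KiB).  Before the CMA pages are
	 * activated, managed_pages is 0 and the reserve computes to 0,
	 * which is the boot time vs. run time inconsistency discussed
	 * above.
	 */
	return 0;
}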