On Mon 27-04-20 16:35:58, Andrew Morton wrote:
[...]
> No consumer of GFP_ATOMIC memory should consume an unbounded amount of
> it.
> Subsystems such as networking will consume a certain amount and
> will then start recycling it. The total amount in-flight will vary
> over the longer term as workloads change. A dynamically tuning
> threshold system will need to adapt rapidly enough to sudden load
> shifts, which might require unreasonable amounts of headroom.

I do agree. __GFP_HIGH/__GFP_ATOMIC are bounded by the size of the
reserves under memory pressure. Then allocations start failing very
quickly and users have to cope with that, usually by deferring to a
sleepable context. Tuning reserves dynamically for heavy reserves
consumers would be possible but I am worried that this is far from
trivial.

We definitely need to understand what is going on here. Why don't
kswapd + N*direct reclaimers provide enough memory to satisfy both
N threads + reserves consumers? How many times do those direct
reclaimers have to retry?

We used to have the allocation stall warning, as David mentioned in
the patch description, and I have seen it triggering without heavy
reserves consumers (aka reported free pages corresponded to the min
watermark). The underlying problem was usually kswapd being stuck on
some FS locks, direct reclaimers stuck in shrinkers, or a way too
overloaded system with dozens if not hundreds of processes stuck in
the page allocator, each racing with the reclaim and betting on luck.
The last problem was the most annoying because it is really hard to
tune for.
-- 
Michal Hocko
SUSE Labs