On Thu, 22 Aug 2024 at 17:16, Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> GFP_USER allocation only implies __GFP_HARDWALL and that only makes
> a difference for cpusets. It doesn't make a difference in most cases
> though.

That's what it does today.

We used to have a very clear notion of "how hard to try". It was
"LOW", "MED" and "HIGH".

And GFP_USER used __GFP_LOW, exactly so that the MM layer knew not to
try very hard. GFP_ATOMIC used __GFP_HIGH, to say "use the reserved
resources". GFP_KERNEL then at one point used __GFP_MED, to say "don't
dip into the reserved pool, but retry harder".

But exactly because people did want kernel allocations to basically
always succeed, GFP_KERNEL ended up using __GFP_HIGH too.

> There is a fundamental difference here. GFP_NOFAIL _guarantees_ that
> the allocation will not fail so callers do not check for the failure
> because they have (presumably) no (practical) way to handle the
> failure.

And this mindset needs to go away. That's what I've been trying to
say. It absolutely MUST NOT GUARANTEE THAT.

I've seen crap patches that say "BUG_ON() if we cannot guarantee it",
and I'm NACKing those kinds of completely bogus models.

The hard reality needs to be that GFP_NOFAIL is simply IGNORED if
people mis-use it. It absolutely HAS to be a "conditional no-failure".
And it needs to be conditional on both size and things like "I'm
allowed to do reclaim".

Any discussion that starts with "GFP_NOFAIL is a guarantee" needs to
*DIE*.

              Linus