Re: [PATCH v3 0/4] mm: clarify nofail memory allocation

Michal Hocko <mhocko@xxxxxxxx> · Thu, 22 Aug 2024 11:16:39 +0200

On Thu 22-08-24 17:08:15, Linus Torvalds wrote:
> On Thu, 22 Aug 2024 at 16:39, David Hildenbrand <david@xxxxxxxxxx> wrote:
> >
> > Linus has a point that "retry forever" can also be nasty. I think the
> > important part here is, though, that we report sufficient information
> > (stacktrace), such that the problem can be debugged reasonably well, and
> > not just having a locked-up system.
> 
> Unless I missed some case, I *think* most NOFAIL cases are actually
> fairly small.
> 
> In fact, I suspect many of them are so small that we already
> effectively give that guarantee:
> 
> > But then again, sizeof(struct resource) is probably so small that it
> > likely would never fail.
> 
> Iirc, we had the policy of never failing unrestricted kernel
> allocations that are smaller than a page (where "unrestricted" means
> that it's a regular GFP_KERNEL, not some NOFS or similar allocation).
> 
> In fact, I think we practically speaking still do. We really *really*
> tend to try very hard to retry small allocations.

Yes, we try very hard, but allocation failure is still possible in some
corner cases, so callers _must_ check the return value and deal with it.
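
To illustrate the caller-side pattern, here is a minimal userspace sketch.
It is not kernel code: malloc() stands in for kmalloc(size, GFP_KERNEL),
and struct small_obj, alloc_may_fail(), small_obj_create() and the
fail_next_alloc knob are all hypothetical names made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a small kernel object such as struct resource. */
struct small_obj {
	unsigned long start;
	unsigned long end;
};

/* Hypothetical fault-injection knob so the failure path can be exercised. */
bool fail_next_alloc;

/* Stand-in for kmalloc(size, GFP_KERNEL): it rarely fails, but it can. */
static void *alloc_may_fail(size_t size)
{
	if (fail_next_alloc) {
		fail_next_alloc = false;
		return NULL;
	}
	return malloc(size);
}

/* Returns 0 on success or -ENOMEM; *out is only written on success. */
int small_obj_create(struct small_obj **out,
		     unsigned long start, unsigned long end)
{
	struct small_obj *obj;

	assert(out != NULL);

	obj = alloc_may_fail(sizeof(*obj));
	if (!obj)
		return -ENOMEM;	/* small allocation, still checked */

	obj->start = start;
	obj->end = end;
	*out = obj;
	return 0;
}
```

The point is only that the NULL check and the -ENOMEM return stay in
place even though the object is far smaller than a page.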

> That was one of the things that GFP_USER does - it's identical to
> GFP_KERNEL, but it basically tells the MM that it should not try so
> hard because an allocation failure was fine.

GFP_USER allocations only imply __GFP_HARDWALL, and that only makes a
difference for cpusets. It doesn't make a difference in most cases, though.
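
The relationship between the two masks can be sketched in userspace as
follows. The MOCK_ prefix and the bit values are assumptions made for
this example (the kernel's real values differ, and the real names use a
leading double underscore). Only the composition of the masks mirrors
the kernel definitions. gfp_extra_bits() and gfp_has_hardwall() are
hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int mock_gfp_t;

/* Placeholder bit values for illustration only. */
#define MOCK_GFP_IO		0x01u
#define MOCK_GFP_FS		0x02u
#define MOCK_GFP_RECLAIM	0x04u
#define MOCK_GFP_HARDWALL	0x08u

/* Same composition as the kernel's GFP_KERNEL and GFP_USER masks. */
#define MOCK_GFP_KERNEL	(MOCK_GFP_RECLAIM | MOCK_GFP_IO | MOCK_GFP_FS)
#define MOCK_GFP_USER	(MOCK_GFP_KERNEL | MOCK_GFP_HARDWALL)

/* The only bit GFP_USER adds on top of GFP_KERNEL is the hardwall bit. */
static_assert((MOCK_GFP_USER & ~MOCK_GFP_KERNEL) == MOCK_GFP_HARDWALL,
	      "GFP_USER must differ from GFP_KERNEL by HARDWALL only");

/* Bits set in a but not in b. */
mock_gfp_t gfp_extra_bits(mock_gfp_t a, mock_gfp_t b)
{
	return a & ~b;
}

/* True if the mask asks for the cpuset hardwall restriction. */
bool gfp_has_hardwall(mock_gfp_t gfp)
{
	return (gfp & MOCK_GFP_HARDWALL) != 0;
}
```

Nothing in the mask tells the allocator to retry less; the single extra
bit only matters where cpusets restrict which nodes may be used.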

> In fact, kernel allocations try so hard that we have those "opposite
> flags" of ___GFP_NORETRY and ___GFP_RETRY_MAYFAIL because we often try
> *TOO* hard, and reasonably many code-paths have that whole "let's
> optimistically ask for a big allocation, but not try very hard and not
> warn if it fails, because we can fall back on a smaller one".
> 
> So it's _really_ hard to fail a small GFP_KERNEL allocation. It used
> to be practically impossible, and in fact I think GFP_NOFAIL was
> originally added long ago when the MM code was going through big
> upheavals and one of the things that was mucked around with was the
> whole "how hard to retry".

There is a fundamental difference here. GFP_NOFAIL _guarantees_ that the
allocation will not fail, so callers do not check for failure because
they have (presumably) no (practical) way to handle it.
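
The contract difference can be sketched in userspace like this. It is a
rough model only, not how the page allocator implements it: try_alloc(),
alloc_nofail() and the two counters are hypothetical, malloc() stands in
for the allocator, and a plain retry loop stands in for reclaim.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical knob: number of upcoming attempts that should fail. */
int simulated_failures;
/* Counts attempts so the retry behaviour is observable. */
int alloc_attempts;

/* A single attempt that may fail, like an ordinary allocation. */
static void *try_alloc(size_t size)
{
	alloc_attempts++;
	if (simulated_failures > 0) {
		simulated_failures--;
		return NULL;
	}
	return malloc(size);
}

/*
 * Never returns NULL: keeps retrying until an attempt succeeds.
 * If memory never becomes available, this loops forever.
 */
void *alloc_nofail(size_t size)
{
	void *p;

	/* malloc(0) may legitimately return NULL, which would spin forever. */
	assert(size > 0);

	do {
		p = try_alloc(size);
	} while (!p);

	return p;
}
```

A caller of alloc_nofail() uses the result without a NULL check, which
is exactly why the guarantee has to hold unconditionally.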
-- 
Michal Hocko
SUSE Labs