Re: [PATCH v3 0/4] mm: clarify nofail memory allocation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 19 Aug 2024 at 06:02, David Hildenbrand <david@xxxxxxxxxx> wrote:
> >
> > If we must still fail a nofail allocation, we should trigger a BUG rather
> > than exposing NULL dereferences to callers who do not check the return
> > value.
>
> I am not convinced that BUG_ON is the right tool here to save the world,
> but I see how we arrived here.

I think the thing to do is to just add a

     WARN_ON_ONCE((flags & __GFP_NOFAIL) && bad_nofail_alloc(oder, flags));

or similar, where that bad_nofail_alloc() checks that the allocation
order is small and that the flags are sane for a NOFAIL allocation.

Because no, BUG_ON() is *never* the answer. The answer is to make sure
nobody ever sets NOFAIL in situations where the allocation can fail
and there is no way forward.

A BUG_ON() will quite likely just make things worse. You're better off
with a WARN_ON() and letting the caller just oops.

Honestly, I'm perfectly fine with just removing that stupid useless
flag entirely. The flag goes back to 2003 and was introduced in
2.5.69, and was meant to be for very particular uses that otherwise
just looped waiting for memory.

Back in 2.5.69, there was exactly one user: the jbd journal code, that
did a buffer head allocation with GFP_NOFAIL.  By 2.6.0 that had
expanded by another user in XFS, and even that one had a comment
saying that it needed to be narrowed down. And in fact, by the 2.6.12
release, that XFS use had been removed, but the jbd journal had grown
another jbd_kmalloc case for transaction data. So at the beginning of
the git archives, we had exactly *one* user (with two places).

*THAT* is the kind of use that the flag was meant for: small
allocations required to make forward progress in writeout during
memory pressure.

It has then expanded and is now a problem. The cases using GFP_NOFAIL
for things like vmalloc() - which is by definition not a small
allocation - should be just removed as outright bugs.

Note that we had this comment back in 2010:

 * __GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
 * cannot handle allocation failures.  This modifier is deprecated and no new
 * users should be added.

and then it was softened in 2015 to the current

 * __GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
 * cannot handle allocation failures. New users should be evaluated carefully
  ...

and clearly that "evaluated carefully" actually never happened, so the
new comment is just garbage.

I wonder how many modern users of GFP_NOFAIL are simply due to
over-eager allocation failure injection testing, and then people added
GFP_NOFAIL just because it shut up the mindless random allocation
failures.

I mean, we have a __GFP_NOFAIL in rhashtable_init() - which can
actually return an error just fine, but there was this crazy worry
about the IPC layer initialization failing:

   https://lore.kernel.org/all/20180523172500.anfvmjtumww65ief@linux-n805/

Things like that, where people just added mindless "theoretical
concerns" issues, or possibly had some error injection module that
inserted impossible failures.

I do NOT want those things to become BUG_ON()'s. It's better to just
return NULL with a "bogus GFP_NOFAIL" warning, and have the oops
happen in the actual bad place that did an invalid allocation.

Because the blame should go *there*, and it should not even remotely
look like "oh, the MM code failed". No. The caller was garbage.

So no. No MM BUG_ON code.

                    Linus




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux