Re: [PATCH RFC] mm: warn potential return NULL for kmalloc_array and kvmalloc_array with __GFP_NOFAIL

Michal Hocko <mhocko@xxxxxxxx> · Fri, 19 Jul 2024 09:42:02 +0200

On Fri 19-07-24 19:07:31, Barry Song wrote:
> On Fri, Jul 19, 2024 at 7:02 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > On Fri 19-07-24 12:35:55, Barry Song wrote:
> > > On Thu, Jul 18, 2024 at 8:50 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > [...]
> > > > Yes, those shouldn't really fail. NOWAIT|NOFAIL was something that
> > > > should never happen and I really hope it doesn't. Others should really
> > > > retry but it's been some time since I've checked the last time.
> > >
> > >
> > > I assume allocations directly using alloc_pages() might not respect GFP_NOFAIL
> > > and violate the semantics of GFP_NOFAIL.
> >
> > What do you mean?
> 
> I mean, if we are using wrappers like vmalloc (GFP_NOFAIL | GFP_NOWAIT),
> though alloc_pages might return NULL, vmalloc for itself will retry.

vmalloc(NOFAIL|NOWAIT) is equally unsupported. This combination of flags
simply cannot be delivered.

> but if you call alloc_pages() directly with GFP_NOFAIL | GFP_NOWAIT,
> alloc_pages() may return NULL without retry at all. I believe alloc_pages()
> is also wrong.

It cannot reclaim itself and it cannot sleep to wait for the memory so
NOFAIL semantic is simply impossible. We have put a warning in place to
catch abusers but apparently this hasn't been sufficient. There are only
two ways to deal with the failure. Either return NULL and break the
contract and see what happens (implementation now) or BUG_ON and blow up
later if the the failed allocation request blows up - potentially
recoverably. Linus tends to be against adding new BUG() calls unless the
failure is absolutely unrecoverable (e.g. corrupted data structures
etc.). I am not sure how he would look at simply incorrect memory
allocator usage to blow up the kernel. Now the argument could be made
that those failures could cause subtle memory corruptions or even be
exploitable which might be a sufficient reason to stop them early. You
can try that.

I do not see a saner way to deal with this particular memory request
type. Unless we require all __GFP_NOFAIL|GFP_NOWAIT requests to check
for the failure but this makes very little sense to me.

-- 
Michal Hocko
SUSE Labs