Re: [PATCH RFC] mm: warn potential return NULL for kmalloc_array and kvmalloc_array with __GFP_NOFAIL

Barry Song <21cnbao@xxxxxxxxx> · Fri, 19 Jul 2024 19:51:06 +1200

On Fri, Jul 19, 2024 at 7:42 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Fri 19-07-24 19:07:31, Barry Song wrote:
> > On Fri, Jul 19, 2024 at 7:02 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > >
> > > On Fri 19-07-24 12:35:55, Barry Song wrote:
> > > > On Thu, Jul 18, 2024 at 8:50 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > > [...]
> > > > > Yes, those shouldn't really fail. NOWAIT|NOFAIL was something that
> > > > > should never happen and I really hope it doesn't. Others should really
> > > > > retry but it's been some time since I've checked the last time.
> > > >
> > > >
> > > > I assume allocations directly using alloc_pages() might not respect GFP_NOFAIL
> > > > and violate the semantics of GFP_NOFAIL.
> > >
> > > What do you mean?
> >
> > I mean, if we are using wrappers like vmalloc (GFP_NOFAIL | GFP_NOWAIT),
> > though alloc_pages might return NULL, vmalloc for itself will retry.
>
> vmalloc(NOFAIL|NOWAIT) is equally unsupported. This combination of flags
> simply cannot be delivered.
>
> > but if you call alloc_pages() directly with GFP_NOFAIL | GFP_NOWAIT,
> > alloc_pages() may return NULL without retry at all. I believe alloc_pages()
> > is also wrong.
>
> It cannot reclaim itself and it cannot sleep to wait for the memory so
> NOFAIL semantic is simply impossible. We have put a warning in place to

Retrying infinitely is still the "right" behaviour, at least according
to the documentation of __GFP_NOFAIL. I assume that retrying many times
can still eventually obtain memory, because other processes might be
reclaiming or freeing memory and thereby providing free pages to the
one that is stuck.

 * %__GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
 * cannot handle allocation failures. The allocation could block
 * indefinitely but will never return with failure. Testing for
 * failure is pointless.
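
To illustrate what I mean, here is a minimal userspace sketch, not
kernel code. The SKETCH_* flags, try_alloc_once() and alloc_sketch() are
made-up names, and "another task frees a page on every 5th attempt" is
purely an assumption for the demo. It only shows that a retry loop which
can neither reclaim nor sleep may still make progress if somebody else
frees memory:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flags for this sketch; NOT the kernel's gfp_t bits. */
#define SKETCH_NOFAIL 0x1u
#define SKETCH_NOWAIT 0x2u

static int free_pages;	/* simulated free page count */
static int attempts;	/* number of allocation attempts so far */

/* One allocation attempt that never reclaims and never sleeps. */
static void *try_alloc_once(void)
{
	attempts++;
	/* Assumption: some other task frees a page on every 5th attempt. */
	if (attempts % 5 == 0)
		free_pages++;
	if (free_pages > 0) {
		free_pages--;
		return malloc(64);
	}
	return NULL;
}

/*
 * Retry forever when SKETCH_NOFAIL is set; otherwise return whatever
 * the first attempt produced, which may be NULL.
 */
static void *alloc_sketch(unsigned int flags)
{
	void *p;

	for (;;) {
		p = try_alloc_once();
		if (p || !(flags & SKETCH_NOFAIL))
			return p;
		/*
		 * With SKETCH_NOWAIT we cannot sleep here, so this is a
		 * busy loop that relies on others freeing memory.
		 */
	}
}
```

Calling alloc_sketch(SKETCH_NOFAIL | SKETCH_NOWAIT) from a fresh state
returns a valid pointer on the 5th attempt in this model, whereas
alloc_sketch(0) returns NULL as soon as an attempt fails. Of course, if
nobody ever frees memory the loop spins forever, which is the case you
are worried about.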

> catch abusers but apparently this hasn't been sufficient. There are only
> two ways to deal with the failure. Either return NULL and break the
> contract and see what happens (implementation now) or BUG_ON and blow up
> later if the failed allocation request blows up - potentially
> recoverably. Linus tends to be against adding new BUG() calls unless the
> failure is absolutely unrecoverable (e.g. corrupted data structures
> etc.). I am not sure how he would look at simply incorrect memory
> allocator usage to blow up the kernel. Now the argument could be made
> that those failures could cause subtle memory corruptions or even be
> exploitable which might be a sufficient reason to stop them early. You
> can try that.
>
> I do not see a saner way to deal with this particular memory request
> type. Unless we require all __GFP_NOFAIL|GFP_NOWAIT requests to check
> for the failure but this makes very little sense to me.
>
> --
> Michal Hocko
> SUSE Labs