Re: [PATCH RFC] mm: warn potential return NULL for kmalloc_array and kvmalloc_array with __GFP_NOFAIL

Barry Song <21cnbao@xxxxxxxxx> · Fri, 19 Jul 2024 12:35:55 +1200

On Thu, Jul 18, 2024 at 8:50 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Thu 18-07-24 20:43:53, Barry Song wrote:
> > On Thu, Jul 18, 2024 at 8:32 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > >
> > > On Thu 18-07-24 20:18:02, Barry Song wrote:
> > > > So the purpose is making sure the semantics - NOFAIL means no failure
> > > > and we don't need to check ret.  If we can't really succeed, it should throw
> > > > a BUG to stop any potential exploits.
> > >
> > > This would require to panic consistently on failure in all allocator
> > > path that can bail out. E.g. page allocator on GFP_NOWAIT|GFP_NOFAIL
> > > req. not sure how many more.
> >
> > Right, this GFP_NOFAIL issue seems quite messy. However, at least vmalloc
> > will retry by itself, even if alloc_pages might have failed with
> > GFP_NOWAIT | GFP_NOFAIL.
> >
> > But isn't that the definition of __GFP_NOFAIL?
> >
> >  * %__GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
> >  * cannot handle allocation failures. The allocation could block
> >  * indefinitely but will never return with failure. Testing for
> >  * failure is pointless."
> >
> > So I believe any code that doesn't retry and ends up returning NULL should be
> > fixed.
>
> Yes, those shouldn't really fail. NOWAIT|NOFAIL was something that
> should never happen and I really hope it doesn't. Others should really
> retry but it's been some time since I've checked the last time.

I assume allocations directly using alloc_pages() might not respect GFP_NOFAIL
and violate the semantics of GFP_NOFAIL.

static inline struct page *
__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
                                                struct alloc_context *ac) {
        /*
         * Make sure that __GFP_NOFAIL request doesn't leak out and make sure
         * we always retry
         */
        if (gfp_mask & __GFP_NOFAIL) {
                /*
                 * All existing users of the __GFP_NOFAIL are blockable, so warn
                 * of any new users that actually require GFP_NOWAIT
                 */
                if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
                        goto fail;
...
}

Additionally, at least drivers/vdpa/vdpa_user/iova_domain.c is
incorrect with GFP_ATOMIC
| __GFP_NOFAIL.

void vduse_domain_remove_user_bounce_pages(struct vduse_iova_domain *domain)
{
        ...

        count = domain->bounce_size >> PAGE_SHIFT;
        for (i = 0; i < count; i++) {
                ...

                /* Copy user page to kernel page if it's in use */
                if (map->orig_phys != INVALID_PHYS_ADDR) {
                        page = alloc_page(GFP_ATOMIC | __GFP_NOFAIL);
                        memcpy_from_page(page_address(page),
                                         map->bounce_page, 0, PAGE_SIZE);
                }
                put_page(map->bounce_page);
                map->bounce_page = page;
        }
        domain->user_bounce_pages = false;
out:
        write_unlock(&domain->bounce_lock);
}

GFP_NOFAIL things need to be fixed. Let me investigate further.

>
> These overflow checks were added without any acks by MM people...
> --
> Michal Hocko
> SUSE Labs

Thanks
Barry