Re: [PATCH RFC 5/5] non-mm: discourage the usage of __GFP_NOFAIL and encourage GFP_NOFAIL

Michal Hocko <mhocko@xxxxxxxx> · Mon, 29 Jul 2024 13:50:44 +0200



On Fri 26-07-24 14:08:18, Davidlohr Bueso wrote:
> On Thu, 25 Jul 2024, Michal Hocko wrote:
> > On Thu 25-07-24 13:38:50, Barry Song wrote:
> > > On Thu, Jul 25, 2024 at 12:17 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > > >
> > > > On Wed 24-07-24 20:55:44, Barry Song wrote:
> > > > > From: Barry Song <v-songbaohua@xxxxxxxx>
> > > > >
> > > > > GFP_NOFAIL includes the meaning of block and direct reclamation, which
> > > > > is essential for a true no-fail allocation. We are gradually starting
> > > > > to enforce this block semantics to prevent the potential misuse of
> > > > > __GFP_NOFAIL in atomic contexts in the future.
> > > > >
> > > > > A typical example of incorrect usage is in VDPA, where GFP_ATOMIC
> > > > > and __GFP_NOFAIL are used together.
> > > >
> > > > Ohh, so you have done the migration. Please squash those two patches.
> > > > Also if we want to preserve clean __GFP_NOFAIL for internal MM use then it
> > > > should be moved away from include/linux/gfp_types.h. But is there any
> > > > real use for that?
> > > 
> > > yes. currently i got two,
> > > 
> > > lib/rhashtable.c
> > > 
> > > static struct bucket_table *bucket_table_alloc(struct rhashtable *ht,
> > >                                                size_t nbuckets,
> > >                                                gfp_t gfp)
> > > {
> > >         struct bucket_table *tbl = NULL;
> > >         size_t size;
> > >         int i;
> > >         static struct lock_class_key __key;
> > > 
> > >         tbl = alloc_hooks_tag(ht->alloc_tag,
> > >                         kvmalloc_node_noprof(struct_size(tbl, buckets, nbuckets),
> > >                                              gfp|__GFP_ZERO, NUMA_NO_NODE));
> > > 
> > >         size = nbuckets;
> > > 
> > >         if (tbl == NULL && (gfp & ~__GFP_NOFAIL) != GFP_KERNEL) {
> > >                 tbl = nested_bucket_table_alloc(ht, nbuckets, gfp);
> > >                 nbuckets = 0;
> > >         }
> > > 
> > >         ...
> > > 
> > >         return tbl;
> > > }
> > 
> > Ugh. OK this is a weird allocation fallback strategy 2d22ecf6db1c
> > ("lib/rhashtable: guarantee initial hashtable allocation"). Maybe the
> > code should be just simplified and GFP_NOFAIL used from the beginning?
> > Davidlohr WDYT? For your context Barry tries to drop all the
> > __GFP_NOFAIL use and replace it by GFP_NOFAIL which enforces
> > __GFP_DIRECT_RECLAIM so that people cannot request atomic NOFAIL.
> 
> Why is it so weird?

Because it is really hard to figure out what it is supposed to mean.
If the caller uses __GFP_NOFAIL then a NULL return is (should be)
impossible, and if NOFAIL is not used then why does it need to check for
	(gfp & ~__GFP_NOFAIL) != GFP_KERNEL?
This could be GFP_NO{IO,FS} but also GFP_ATOMIC. So what is it even
supposed to mean?
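
To make the concern concrete, here is a minimal userspace sketch of which
masks satisfy that check. The bit values are hypothetical placeholders (the
real ones live in include/linux/gfp_types.h and depend on the kernel
config); only the way the composite masks are built mirrors the kernel.
takes_nested_fallback() is an illustrative name, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int gfp_t;

/* Hypothetical bit values, chosen only for this sketch. */
#define __GFP_HIGH		0x01u
#define __GFP_IO		0x02u
#define __GFP_FS		0x04u
#define __GFP_DIRECT_RECLAIM	0x08u
#define __GFP_KSWAPD_RECLAIM	0x10u
#define __GFP_NOFAIL		0x20u

/* Composite masks, built the same way as in the kernel headers. */
#define __GFP_RECLAIM	(__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define GFP_ATOMIC	(__GFP_HIGH | __GFP_KSWAPD_RECLAIM)
#define GFP_KERNEL	(__GFP_RECLAIM | __GFP_IO | __GFP_FS)
#define GFP_NOIO	(__GFP_RECLAIM)
#define GFP_NOFS	(__GFP_RECLAIM | __GFP_IO)

/* The condition from bucket_table_alloc(), minus the tbl == NULL part. */
static bool takes_nested_fallback(gfp_t gfp)
{
	return (gfp & ~__GFP_NOFAIL) != GFP_KERNEL;
}

/* Which callers would reach the nested fallback on failure? */
static void check_fallback_cases(void)
{
	/* Plain GFP_KERNEL, with or without NOFAIL, never falls back. */
	assert(!takes_nested_fallback(GFP_KERNEL));
	assert(!takes_nested_fallback(GFP_KERNEL | __GFP_NOFAIL));

	/* Everything else does, sleepable or not. */
	assert(takes_nested_fallback(GFP_NOFS));
	assert(takes_nested_fallback(GFP_NOIO));
	assert(takes_nested_fallback(GFP_ATOMIC));
}
```

So the check lumps sleepable GFP_NO{IO,FS} requests together with
GFP_ATOMIC, which is why its intent is hard to read.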

> Perhaps I'm missing your point, but the fallback
> introduced in that commit attempts to avoid abusing nofail semantics
> and only ask with a smaller size.
> 
> In any case, would the following be better (and also silences smatch)?
> Disregarding the initial nofail request, rhashtable allocations are
> always either regular GFP_KERNEL or GFP_ATOMIC (for the nested and
> some insertion cases).
> 
> -----8<-----
> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index dbbed19f8fff..c9f9cce4a3c1 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -184,12 +184,12 @@ static struct bucket_table *bucket_table_alloc(struct rhashtable *ht,
>  	static struct lock_class_key __key;
>  	tbl = alloc_hooks_tag(ht->alloc_tag,
> -			kvmalloc_node_noprof(struct_size(tbl, buckets, nbuckets),
> -					     gfp|__GFP_ZERO, NUMA_NO_NODE));
> +			kvmalloc_noprof(struct_size(tbl, buckets, nbuckets),
> +					gfp|__GFP_ZERO));
>  	size = nbuckets;
> -	if (tbl == NULL && (gfp & ~__GFP_NOFAIL) != GFP_KERNEL) {
> +	if (tbl == NULL && (gfp & GFP_ATOMIC)) {

I have a really hard time following what that is supposed to mean. First,
GFP_ATOMIC is not a mask usable for this kind of test as it is
	__GFP_HIGH|__GFP_KSWAPD_RECLAIM

so GFP_KERNEL & GFP_ATOMIC is true. If you want to explicitly ask for a
sleepable allocation then use gfpflags_allow_blocking, but fundamentally
why do you not simply do
	if (!tbl)
		tbl = nested_bucket_table_alloc(ht, nbuckets, gfp);

Why do gfp flags play any role here?
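
For illustration, a small userspace sketch of the mask problem. The bit
values are again hypothetical placeholders rather than the kernel's real
ones, and gfpflags_allow_blocking() here is a local mirror of the kernel
helper (which tests __GFP_DIRECT_RECLAIM). proposed_atomic_test() is an
illustrative name for the check in the diff above.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int gfp_t;

/* Hypothetical bit values, chosen only for this sketch. */
#define __GFP_HIGH		0x01u
#define __GFP_IO		0x02u
#define __GFP_FS		0x04u
#define __GFP_DIRECT_RECLAIM	0x08u
#define __GFP_KSWAPD_RECLAIM	0x10u

/* Composite masks, built the same way as in the kernel headers. */
#define __GFP_RECLAIM	(__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define GFP_ATOMIC	(__GFP_HIGH | __GFP_KSWAPD_RECLAIM)
#define GFP_KERNEL	(__GFP_RECLAIM | __GFP_IO | __GFP_FS)

/* Userspace mirror of the kernel helper: may this allocation sleep? */
static bool gfpflags_allow_blocking(gfp_t gfp)
{
	return (gfp & __GFP_DIRECT_RECLAIM) != 0;
}

/* The test used in the proposed patch. */
static bool proposed_atomic_test(gfp_t gfp)
{
	return (gfp & GFP_ATOMIC) != 0;
}

static void check_mask_behaviour(void)
{
	/* True for GFP_ATOMIC, as intended ... */
	assert(proposed_atomic_test(GFP_ATOMIC));
	/* ... but also for GFP_KERNEL, which shares __GFP_KSWAPD_RECLAIM. */
	assert(proposed_atomic_test(GFP_KERNEL));

	/* The blocking helper is what actually separates the two. */
	assert(gfpflags_allow_blocking(GFP_KERNEL));
	assert(!gfpflags_allow_blocking(GFP_ATOMIC));
}
```

In other words, (gfp & GFP_ATOMIC) cannot tell a GFP_KERNEL caller from a
GFP_ATOMIC one.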
-- 
Michal Hocko
SUSE Labs