Re: [PATCH 09/11] xfs: rework kmem_alloc_{io,large} to use GFP_* flags

Dave Chinner <david@xxxxxxxxxxxxx> · Thu, 14 Nov 2019 06:56:53 +1100

On Wed, Nov 13, 2019 at 10:08:50AM -0800, Darrick J. Wong wrote:
> On Wed, Nov 13, 2019 at 03:23:33PM +0100, Carlos Maiolino wrote:
> > Pass slab flags directly to these functions
> > 
> > Signed-off-by: Carlos Maiolino <cmaiolino@xxxxxxxxxx>
> > ---
> >  fs/xfs/kmem.c                 | 60 ++++-------------------------------
> >  fs/xfs/kmem.h                 |  8 ++---
> >  fs/xfs/libxfs/xfs_attr_leaf.c |  2 +-
> >  fs/xfs/scrub/attr.c           |  8 ++---
> >  fs/xfs/scrub/attr.h           |  3 +-
> >  fs/xfs/xfs_buf.c              |  7 ++--
> >  fs/xfs/xfs_log.c              |  2 +-
> >  fs/xfs/xfs_log_cil.c          |  2 +-
> >  fs/xfs/xfs_log_recover.c      |  3 +-
> >  9 files changed, 22 insertions(+), 73 deletions(-)
> > 
> > diff --git a/fs/xfs/kmem.c b/fs/xfs/kmem.c
> > index 79467813d810..44145293cfc9 100644
> > --- a/fs/xfs/kmem.c
> > +++ b/fs/xfs/kmem.c
> > @@ -8,54 +8,6 @@
> >  #include "xfs_message.h"
> >  #include "xfs_trace.h"
> >  
> > -static void *
> > -__kmem_alloc(size_t size, xfs_km_flags_t flags)
> > -{
> > -	int	retries = 0;
> > -	gfp_t	lflags = kmem_flags_convert(flags);
> > -	void	*ptr;
> > -
> > -	trace_kmem_alloc(size, flags, _RET_IP_);
> > -
> > -	do {
> > -		ptr = kmalloc(size, lflags);
> > -		if (ptr || (flags & KM_MAYFAIL))
> > -			return ptr;
> > -		if (!(++retries % 100))
> > -			xfs_err(NULL,
> > -	"%s(%u) possible memory allocation deadlock size %u in %s (mode:0x%x)",
> > -				current->comm, current->pid,
> > -				(unsigned int)size, __func__, lflags);
> > -		congestion_wait(BLK_RW_ASYNC, HZ/50);
> 
> Why is it ok to remove the "wait for congestion and retry" logic here?
> Does GFP_NOFAIL do all that for us?

Yes - kmalloc will never return failure if GFP_NOFAIL is set, and
it has internal congestion_wait() backoffs. All we lose here is
the XFS specific error message - we'll get the generic warnings
now...

> Same question for the other two patches that remove similar loops from
> other functions.
> 
> > -	} while (1);
> > -}
> > -
> > -
> > -/*
> > - * __vmalloc() will allocate data pages and auxiliary structures (e.g.
> > - * pagetables) with GFP_KERNEL, yet we may be under GFP_NOFS context here. Hence
> > - * we need to tell memory reclaim that we are in such a context via
> > - * PF_MEMALLOC_NOFS to prevent memory reclaim re-entering the filesystem here
> > - * and potentially deadlocking.
> > - */
> > -static void *
> > -__kmem_vmalloc(size_t size, xfs_km_flags_t flags)
> > -{
> > -	unsigned nofs_flag = 0;
> > -	void	*ptr;
> > -	gfp_t	lflags = kmem_flags_convert(flags);
> > -
> > -	if (flags & KM_NOFS)
> > -		nofs_flag = memalloc_nofs_save();
> > -
> > -	ptr = __vmalloc(size, lflags, PAGE_KERNEL);
> > -
> > -	if (flags & KM_NOFS)
> > -		memalloc_nofs_restore(nofs_flag);
> > -
> > -	return ptr;
> > -}
> 
> I think this deletes too much -- as the comment says, a caller can pass
> in GFP_NOFS and we have to push that context all the way through to
> /any/ allocation that happens under __vmalloc.
> 
> However, it's possible that the mm developers have fixed __vmalloc to
> pass GFP_NOFS through to every allocation that can be made.  Is that the
> case?

Nope, vmalloc will still do GFP_KERNEL allocations internally. We
need to keep the kmem_vmalloc wrappers as they stand, just converted
to use GFP flags.

> TBH this series would be better sequenced as (1) get rid of pointless
> wrappers, (2) convert kmem* callers to use GFP flags, and (3) save
> whatever logic changes for the end.  If nothing else I could pull the
> first two parts and leave the third for later.

*nod*

> >  static inline void *
> > -kmem_zalloc_large(size_t size, xfs_km_flags_t flags)
> > +kmem_zalloc_large(size_t size, gfp_t flags)
> >  {
> > -	return kmem_alloc_large(size, flags | KM_ZERO);
> > +	return kmem_alloc_large(size, flags | __GFP_ZERO);
> >  }

I'd also kill these zalloc wrappers and just pass __GFP_ZERO
directly.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx