Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.

Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> · Fri, 2 Oct 2020 09:06:24 +0100

On Thu, Oct 01, 2020 at 09:26:26PM +0200, Uladzislau Rezki wrote:
> > 
> > No, I meant going back to idea of new gfp flag, but adjust the implementation in
> > the allocator (different from what you posted in previous version) so that it
> > only looks at the flag after it tries to allocate from pcplist and finds out
> > it's empty. So, no inventing of new page allocator entry points or checks such
> > as the one you wrote above, but adding the new gfp flag in a way that it doesn't
> > affect existing fast paths.
> >
> OK. Now i see. Please have a look below at the patch, so we fully understand
> each other. If that is something that is close to your view or not:
> 
> <snip>
> t a/include/linux/gfp.h b/include/linux/gfp.h
> index c603237e006c..7e613560a502 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -39,8 +39,9 @@ struct vm_area_struct;
>  #define ___GFP_HARDWALL                0x100000u
>  #define ___GFP_THISNODE                0x200000u
>  #define ___GFP_ACCOUNT         0x400000u
> +#define ___GFP_NO_LOCKS                0x800000u
>  #ifdef CONFIG_LOCKDEP
> -#define ___GFP_NOLOCKDEP       0x800000u
> +#define ___GFP_NOLOCKDEP       0x1000000u
>  #else
>  #define ___GFP_NOLOCKDEP       0
>  #endif
> @@ -215,16 +216,22 @@ struct vm_area_struct;
>   * %__GFP_COMP address compound page metadata.
>   *
>   * %__GFP_ZERO returns a zeroed page on success.
> + *
> + * %__GFP_NO_LOCKS order-0 allocation without sleepable-locks.
> + * It obtains a page from the per-cpu-list and considered as
> + * lock-less. No other actions are performed, thus it returns
> + * NULL if per-cpu-list is empty.
>   */
>  #define __GFP_NOWARN   ((__force gfp_t)___GFP_NOWARN)
>  #define __GFP_COMP     ((__force gfp_t)___GFP_COMP)
>  #define __GFP_ZERO     ((__force gfp_t)___GFP_ZERO)
> +#define __GFP_NO_LOCKS ((__force gfp_t)___GFP_NO_LOCKS)
> 

I'm not a fan of the GFP flag approach simply because we've had cases
before where GFP flags were used in inappropriate contexts like
__GFP_MEMALLOC which led to a surprising amount of bugs, particularly
from out-of-tree drivers but also in-tree drivers. Of course, there
are limited GFP flags available too but at least the comment should
be as robust as possible. Maybe something like

 * %__GFP_NO_LOCKS attempts order-0 allocation without sleepable-locks. It
 * attempts to obtain a page without acquiring any spinlocks. This
 * should only be used in a context where the holder holds a
 * raw_spin_lock that cannot be released for the allocation request.
 * This may be necessary in PREEMPT_RT kernels where a
 * raw_spin_lock is held which does not sleep tries to acquire a
 * spin_lock that can sleep with PREEMPT_RT. This should not be
 * confused with GFP_ATOMIC contexts. Like atomic allocation
 * requests, there is no guarantee a page will be returned and
 * the caller must be able to deal with allocation failures.
 * The risk of allocation failure is higher than using GFP_ATOMIC.

It's verbose but it would be hard to misinterpret. I think we're
going to go through a period of time before people get familiar
with PREEMPT_RT-related hazards as various comments that were
true are going to be misleading for a while.

For anyone reviewing, any use of __GFP_NO_LOCKS should meet a high
standard where there is no alternative except to use the flags. i.e. a
higher standard "but I'm an important driver".

-- 
Mel Gorman
SUSE Labs