Re: [PATCH 1/2] mm: Add memalloc_nowait_{save,restore}

On Wed, Aug 14, 2024 at 8:43 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Wed 14-08-24 16:12:27, Yafang Shao wrote:
> > On Wed, Aug 14, 2024 at 3:42 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > >
> > > On Mon 12-08-24 20:59:53, Yafang Shao wrote:
> > > > On Mon, Aug 12, 2024 at 7:37 PM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Mon, Aug 12, 2024 at 05:05:24PM +0800, Yafang Shao wrote:
> > > > > > The PF_MEMALLOC_NORECLAIM flag was introduced in commit eab0af905bfc
> > > > > > ("mm: introduce PF_MEMALLOC_NORECLAIM, PF_MEMALLOC_NOWARN"). To complement
> > > > > > this, let's add two helper functions, memalloc_nowait_{save,restore}, which
> > > > > > will be useful in scenarios where we want to avoid waiting for memory
> > > > > > reclamation.
> > > > >
> > > > > No, forcing nowait on callee contexts is just asking for trouble.
> > > > > Unlike NOIO or NOFS this is incompatible with NOFAIL allocations
> > > >
> > > > I don’t see any incompatibility in __alloc_pages_slowpath(). The
> > > > ~__GFP_DIRECT_RECLAIM flag only ensures that direct reclaim is not
> > > > performed, but it doesn’t prevent the allocation of pages from
> > > > ALLOC_MIN_RESERVE, correct?
> > >
> > > Right but this means that you just made any potential nested allocation
> > > within the scope that is GFP_NOFAIL a busy loop essentially. Not to
> > > mention that it would hit a BUG_ON, as non-sleeping GFP_NOFAIL
> > > allocations are unsupported. I believe this is what Christoph had in mind.
> >
> > If that's the case, I believe we should at least consider adding the
> > following code change to the kernel:
>
> We already do have that
>                 /*
>                  * All existing users of the __GFP_NOFAIL are blockable, so warn
>                  * of any new users that actually require GFP_NOWAIT
>                  */
>                 if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
>                         goto fail;

I don't see a reason to place the `goto fail;` above the
`__alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_MIN_RESERVE, ac);`
line. Since we've already woken up kswapd, it should be acceptable to
allocate memory from ALLOC_MIN_RESERVE temporarily. Why not consider
implementing the following changes instead?

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9ecf99190ea2..598d4df829cd 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4386,13 +4386,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
         * we always retry
         */
        if (gfp_mask & __GFP_NOFAIL) {
-               /*
-                * All existing users of the __GFP_NOFAIL are blockable, so warn
-                * of any new users that actually require GFP_NOWAIT
-                */
-               if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
-                       goto fail;
-
                /*
                 * PF_MEMALLOC request from this context is rather bizarre
                 * because we cannot reclaim anything and only can loop waiting
@@ -4419,6 +4412,14 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
                if (page)
                        goto got_pg;

+               /*
+                * All existing users of the __GFP_NOFAIL are blockable, so warn
+                * of any new users that actually require GFP_NOWAIT
+                */
+               if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask)) {
+                       goto fail;
+               }
+
                cond_resched();
                goto retry;
        }
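
To make the ordering difference concrete, here is a rough user-space model of
the __GFP_NOFAIL branch before and after the change above. It is only a sketch:
every model_* name is hypothetical, the reserve attempt is reduced to a boolean,
and the PF_MEMALLOC and costly-order warnings as well as cond_resched() are left
out.

```c
#include <stdbool.h>

/* Hypothetical outcomes of one pass through the __GFP_NOFAIL branch. */
enum model_result { MODEL_GOT_PAGE, MODEL_FAIL, MODEL_RETRY };

/*
 * Current ordering: a non-blockable NOFAIL request warns and fails
 * before the ALLOC_MIN_RESERVE attempt is ever made.
 */
static enum model_result model_nofail_current(bool can_direct_reclaim,
					      bool reserve_has_page)
{
	if (!can_direct_reclaim)
		return MODEL_FAIL;	/* WARN_ON_ONCE_GFP(); goto fail; */
	if (reserve_has_page)
		return MODEL_GOT_PAGE;	/* goto got_pg; */
	return MODEL_RETRY;		/* goto retry; */
}

/*
 * Proposed ordering: try the min reserve first, and only warn and
 * fail if that attempt also comes back empty.
 */
static enum model_result model_nofail_proposed(bool can_direct_reclaim,
					       bool reserve_has_page)
{
	if (reserve_has_page)
		return MODEL_GOT_PAGE;	/* goto got_pg; */
	if (!can_direct_reclaim)
		return MODEL_FAIL;	/* WARN_ON_ONCE_GFP(); goto fail; */
	return MODEL_RETRY;		/* goto retry; */
}
```

In this model the two orderings differ in exactly one case: a non-blockable
request for which the reserve still has a page. The current code fails it,
while the proposed code returns the page.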

>
> But Barry has patches to turn that into BUG because failing NOFAIL
> allocations is not cool and causes unexpected failures. Have a look at
> https://lore.kernel.org/all/20240731000155.109583-1-21cnbao@xxxxxxxxx/
>
> > > I am really
> > > surprised that we even have PF_MEMALLOC_NORECLAIM in the first place!
> >
> > There are use cases for it.
>
> Right but there are certain constraints that we need to worry about to
> have maintainable code. Scoped allocation constraints are really a good
> feature when they have well-defined semantics. E.g. NOFS, NOIO or
> NOMEMALLOC (although this is more self inflicted injury exactly because
> PF_MEMALLOC had a "use case"). NOWAIT scope semantic might seem a good
> feature but it falls apart on nested NOFAIL allocations! So the flag is
> usable _only_ if you fully control the whole scoped context. Good luck
> with that long term! This is fragile, hard to review and even harder to
> keep working properly. The flag would have been Nacked on that ground.
> But nobody asked...

It's already implemented, and complaints won't resolve the issue. How
about making the following change to provide a warning when this new
flag is used incorrectly?

diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 4fbae0013166..5a1e1bcde347 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -267,9 +267,10 @@ static inline gfp_t current_gfp_context(gfp_t flags)
                 * Stronger flags before weaker flags:
                 * NORECLAIM implies NOIO, which in turn implies NOFS
                 */
-               if (pflags & PF_MEMALLOC_NORECLAIM)
+               if (pflags & PF_MEMALLOC_NORECLAIM) {
                        flags &= ~__GFP_DIRECT_RECLAIM;
-               else if (pflags & PF_MEMALLOC_NOIO)
+                       WARN_ON_ONCE_GFP(flags & __GFP_NOFAIL, flags);
+               } else if (pflags & PF_MEMALLOC_NOIO)
                        flags &= ~(__GFP_IO | __GFP_FS);
                else if (pflags & PF_MEMALLOC_NOFS)
                        flags &= ~__GFP_FS;
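
For illustration, the intended behaviour can be modelled in user space roughly
as below. This is a sketch under assumptions, not kernel code: the MODEL_* bit
values are made up for the example and do not match the real GFP or PF_ values,
and a counter plus a message on stderr stands in for WARN_ON_ONCE_GFP().

```c
#include <stdio.h>

/* Hypothetical bit values for this sketch only; the kernel's differ. */
#define MODEL_GFP_IO			0x01u
#define MODEL_GFP_FS			0x02u
#define MODEL_GFP_DIRECT_RECLAIM	0x04u
#define MODEL_GFP_NOFAIL		0x08u

#define MODEL_PF_NOFS			0x01u
#define MODEL_PF_NOIO			0x02u
#define MODEL_PF_NORECLAIM		0x04u

/* Number of times the modelled warning fired. */
static unsigned int model_warn_count;

/* Mirrors the branch structure of the patched current_gfp_context(). */
static unsigned int model_gfp_context(unsigned int pflags, unsigned int flags)
{
	if (pflags & MODEL_PF_NORECLAIM) {
		flags &= ~MODEL_GFP_DIRECT_RECLAIM;
		/* A NOFAIL request inside a NORECLAIM scope cannot block. */
		if (flags & MODEL_GFP_NOFAIL) {
			model_warn_count++;
			fprintf(stderr, "warn: NOFAIL inside NORECLAIM scope\n");
		}
	} else if (pflags & MODEL_PF_NOIO) {
		flags &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
	} else if (pflags & MODEL_PF_NOFS) {
		flags &= ~MODEL_GFP_FS;
	}
	return flags;
}
```

Only the combination of a NORECLAIM scope and a NOFAIL request triggers the
warning in this model. The NOIO and NOFS scopes keep masking their flags
without warning, which matches the point that those scopes remain compatible
with NOFAIL.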

--
Regards


Yafang




