On Wed, Aug 14, 2024 at 8:43 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Wed 14-08-24 16:12:27, Yafang Shao wrote:
> > On Wed, Aug 14, 2024 at 3:42 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > >
> > > On Mon 12-08-24 20:59:53, Yafang Shao wrote:
> > > > On Mon, Aug 12, 2024 at 7:37 PM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Mon, Aug 12, 2024 at 05:05:24PM +0800, Yafang Shao wrote:
> > > > > > The PF_MEMALLOC_NORECLAIM flag was introduced in commit eab0af905bfc
> > > > > > ("mm: introduce PF_MEMALLOC_NORECLAIM, PF_MEMALLOC_NOWARN"). To complement
> > > > > > this, let's add two helper functions, memalloc_nowait_{save,restore}, which
> > > > > > will be useful in scenarios where we want to avoid waiting for memory
> > > > > > reclamation.
> > > > >
> > > > > No, forcing nowait on callee contexts is just asking for trouble.
> > > > > Unlike NOIO or NOFS this is incompatible with NOFAIL allocations
> > > >
> > > > I don’t see any incompatibility in __alloc_pages_slowpath(). The
> > > > ~__GFP_DIRECT_RECLAIM flag only ensures that direct reclaim is not
> > > > performed, but it doesn’t prevent the allocation of pages from
> > > > ALLOC_MIN_RESERVE, correct?
> > >
> > > Right but this means that you just made any potential nested allocation
> > > within the scope that is GFP_NOFAIL a busy loop essentially. Not to
> > > mention it BUG_ON as non-sleeping GFP_NOFAIL allocations are
> > > unsupported. I believe this is what Christoph had in mind.
> >
> > If that's the case, I believe we should at least consider adding the
> > following code change to the kernel:
>
> We already do have that
>         /*
>          * All existing users of the __GFP_NOFAIL are blockable, so warn
>          * of any new users that actually require GFP_NOWAIT
>          */
>         if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
>                 goto fail;

I don't see a reason to place the `goto fail;` above the
`__alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_MIN_RESERVE, ac);`
line.
Since we've already woken up kswapd, it should be acceptable to allocate
memory from ALLOC_MIN_RESERVE temporarily. Why not consider implementing
the following changes instead?

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9ecf99190ea2..598d4df829cd 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4386,13 +4386,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
          * we always retry
          */
         if (gfp_mask & __GFP_NOFAIL) {
-                /*
-                 * All existing users of the __GFP_NOFAIL are blockable, so warn
-                 * of any new users that actually require GFP_NOWAIT
-                 */
-                if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
-                        goto fail;
-
                 /*
                  * PF_MEMALLOC request from this context is rather bizarre
                  * because we cannot reclaim anything and only can loop waiting
@@ -4419,6 +4412,14 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
                 if (page)
                         goto got_pg;

+                /*
+                 * All existing users of the __GFP_NOFAIL are blockable, so warn
+                 * of any new users that actually require GFP_NOWAIT
+                 */
+                if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask)) {
+                        goto fail;
+                }
+
                 cond_resched();
                 goto retry;
         }

>
> But Barry has patches to turn that into BUG because failing NOFAIL
> allocations is not cool and cause unexpected failures. Have a look at
> https://lore.kernel.org/all/20240731000155.109583-1-21cnbao@xxxxxxxxx/
>
> > > I am really
> > > surprised that we even have PF_MEMALLOC_NORECLAIM in the first place!
> >
> > There's use cases for it.
>
> Right but there are certain constraints that we need to worry about to
> have a maintainable code. Scope allocation constraints are really a good
> feature when that has a well defined semantic. E.g. NOFS, NOIO or
> NOMEMALLOC (although this is more self inflicted injury exactly because
> PF_MEMALLOC had a "use case"). NOWAIT scope semantic might seem a good
> feature but it falls apart on nested NOFAIL allocations! So the flag is
> usable _only_ if you fully control the whole scoped context. Good luck
> with that long term!
> This is fragile, hard to review and even harder to
> keep working properly. The flag would have been Nacked on that ground.
> But nobody asked...

It's already implemented, and complaints won't resolve the issue. How
about making the following change to provide a warning when this new
flag is used incorrectly?

diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 4fbae0013166..5a1e1bcde347 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -267,9 +267,10 @@ static inline gfp_t current_gfp_context(gfp_t flags)
                  * Stronger flags before weaker flags:
                  * NORECLAIM implies NOIO, which in turn implies NOFS
                  */
-                if (pflags & PF_MEMALLOC_NORECLAIM)
+                if (pflags & PF_MEMALLOC_NORECLAIM) {
                         flags &= ~__GFP_DIRECT_RECLAIM;
-                else if (pflags & PF_MEMALLOC_NOIO)
+                        WARN_ON_ONCE_GFP(flags & __GFP_NOFAIL, flags);
+                } else if (pflags & PF_MEMALLOC_NOIO)
                         flags &= ~(__GFP_IO | __GFP_FS);
                 else if (pflags & PF_MEMALLOC_NOFS)
                         flags &= ~__GFP_FS;

--
Regards
Yafang
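
To make the intent of the current_gfp_context() check easier to follow,
here is a minimal userspace sketch of the flag masking it performs. This
is not kernel code: the MODEL_* names and their bit values, as well as
model_gfp_context() and model_self_check(), are made up for illustration,
and a plain bool stands in for WARN_ON_ONCE_GFP().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for this userspace model; NOT the kernel's. */
#define MODEL_GFP_IO             0x1u
#define MODEL_GFP_FS             0x2u
#define MODEL_GFP_DIRECT_RECLAIM 0x4u
#define MODEL_GFP_NOFAIL         0x8u

#define MODEL_PF_NOFS            0x1u
#define MODEL_PF_NOIO            0x2u
#define MODEL_PF_NORECLAIM       0x4u

/*
 * Stand-in for the scope masking in the diff above. *warned is set when
 * a NOFAIL request is made inside a NORECLAIM scope, which is the spot
 * where the proposed change would emit a one-time warning.
 */
unsigned int model_gfp_context(unsigned int pflags, unsigned int flags,
                               bool *warned)
{
    *warned = false;

    if (pflags & MODEL_PF_NORECLAIM) {
        flags &= ~MODEL_GFP_DIRECT_RECLAIM;
        if (flags & MODEL_GFP_NOFAIL)
            *warned = true;
    } else if (pflags & MODEL_PF_NOIO) {
        flags &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
    } else if (pflags & MODEL_PF_NOFS) {
        flags &= ~MODEL_GFP_FS;
    }
    return flags;
}

/* Usage example: exercises the two cases discussed in the thread. */
void model_self_check(void)
{
    bool warned;
    unsigned int out;

    /* NOFAIL inside a NORECLAIM scope: reclaim bit stripped, warning set. */
    out = model_gfp_context(MODEL_PF_NORECLAIM,
                            MODEL_GFP_DIRECT_RECLAIM | MODEL_GFP_NOFAIL,
                            &warned);
    assert(!(out & MODEL_GFP_DIRECT_RECLAIM));
    assert(warned);

    /* NOFAIL inside a NOFS scope can still block: no warning. */
    out = model_gfp_context(MODEL_PF_NOFS,
                            MODEL_GFP_FS | MODEL_GFP_DIRECT_RECLAIM |
                            MODEL_GFP_NOFAIL, &warned);
    assert(out & MODEL_GFP_DIRECT_RECLAIM);
    assert(!warned);
}
```

The model only shows where the warning would trigger. It says nothing
about what the page allocator does afterwards with a NOFAIL request that
can no longer enter direct reclaim, which is the open question in this
thread.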