Re: [PATCH] mm: document risk of PF_MEMALLOC_NORECLAIM

On Sat 17-08-24 10:29:31, Yafang Shao wrote:
> On Fri, Aug 16, 2024 at 4:17 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > Andrew, could you merge the following before PF_MEMALLOC_NORECLAIM can
> > be removed from the tree altogether please? For the full context the
> > email thread starts here: https://lore.kernel.org/all/20240812090525.80299-1-laoar.shao@xxxxxxxxx/T/#u
> > ---
> > From f17d36975ec343d9388aa6dbf9ca8d1b58ed09ce Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <mhocko@xxxxxxxx>
> > Date: Fri, 16 Aug 2024 10:10:00 +0200
> > Subject: [PATCH] mm: document risk of PF_MEMALLOC_NORECLAIM
> >
> > PF_MEMALLOC_NORECLAIM was added even though it had been pointed out [1]
> > that such an allocation context is inherently unsafe if the context
> > doesn't fully control all allocations called from it. Any potential
> > __GFP_NOFAIL request from within a PF_MEMALLOC_NORECLAIM context
> > would BUG_ON if the allocation failed.
> >
> > [1] https://lore.kernel.org/all/ZcM0xtlKbAOFjv5n@tiehlicka/
> >
> > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> 
> Documenting the risk is a good first step. For this change:
> 
> Acked-by: Yafang Shao <laoar.shao@xxxxxxxxx>
> 
> Even without the PF_MEMALLOC_NORECLAIM flag, the underlying risk
> remains, as users can still clear __GFP_DIRECT_RECLAIM while setting
> __GFP_NOFAIL.

Users can configure all sorts of nonsensical combinations of gfp flags.
That is a sad reality of the interface. But we do assume that kernel
code is reasonably sane.

Besides that, Barry is working on making this less likely by dropping
__GFP_NOFAIL and replacing it with GFP_NOFAIL, which always includes
__GFP_DIRECT_RECLAIM. Sure, nothing will prevent callers from clearing
that flag explicitly, but we have no real defense against broken code.
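
A minimal userspace sketch of that idea, not the actual patch: the bit
values are made up for illustration, the GFP_NOFAIL definition is only
an assumption about how such a composite could look, and
gfp_nofail_without_reclaim() is a hypothetical helper, not a kernel
function.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the kernel's real gfp bits differ. */
typedef unsigned int gfp_t;
#define __GFP_DIRECT_RECLAIM 0x1u
#define __GFP_KSWAPD_RECLAIM 0x2u
#define __GFP_NOFAIL         0x4u

/* Assumed composite: "must not fail" always carries direct reclaim. */
#define GFP_NOFAIL (__GFP_NOFAIL | __GFP_DIRECT_RECLAIM)

/*
 * Hypothetical helper: true for the nonsensical request that must not
 * fail yet is not allowed to enter direct reclaim.
 */
static inline bool gfp_nofail_without_reclaim(gfp_t gfp)
{
	return (gfp & __GFP_NOFAIL) && !(gfp & __GFP_DIRECT_RECLAIM);
}

/* The composite is sane; explicitly clearing the reclaim bit is not. */
void gfp_nofail_demo(void)
{
	assert(!gfp_nofail_without_reclaim(GFP_NOFAIL));
	assert(gfp_nofail_without_reclaim(GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM));
}
```

The composite only removes the accidental case. A caller that masks
out __GFP_DIRECT_RECLAIM by hand still ends up with the broken
combination, which is all a check like this could flag.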

> PF_MEMALLOC_NORECLAIM does not create this risk; it
> only exacerbates it. The core problem lies in the complexity of the
> various GFP flags and the lack of robust management for them. While we
> have extensive documentation on these flags, it can still be
> confusing, particularly for new developers who haven't yet encountered
> real-world issues.
> 
> For instance:
> 
>   * %GFP_NOWAIT is for kernel allocations that should not stall for direct
>   * reclaim,
>   #define GFP_NOWAIT      (__GFP_KSWAPD_RECLAIM | __GFP_NOWARN)
> 
> Initially, it wasn't clear to me why setting __GFP_KSWAPD_RECLAIM and
> __GFP_NOWARN would prevent direct reclaim. It only became apparent
> after I studied the entire code path of page allocation. I believe
> other newcomers to kernel development may face similar confusion as I
> did early in my experience.
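
The GFP_NOWAIT case above can be sketched in userspace as follows. The
bit values are made up for illustration and allows_direct_reclaim() is
a hypothetical name; the composition of the masks follows the kernel's
definitions, and the test is the same one gfpflags_allow_blocking()
performs. Direct reclaim is opt-in via its own bit, and GFP_NOWAIT
simply never sets it.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the kernel's real gfp bits differ. */
typedef unsigned int gfp_t;
#define __GFP_DIRECT_RECLAIM 0x01u /* caller may enter direct reclaim */
#define __GFP_KSWAPD_RECLAIM 0x02u /* caller may wake kswapd */
#define __GFP_NOWARN         0x04u /* no allocation failure warning */
#define __GFP_IO             0x08u
#define __GFP_FS             0x10u

#define __GFP_RECLAIM (__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define GFP_KERNEL    (__GFP_RECLAIM | __GFP_IO | __GFP_FS)
#define GFP_NOWAIT    (__GFP_KSWAPD_RECLAIM | __GFP_NOWARN)

/* Hypothetical name; mirrors the check in gfpflags_allow_blocking(). */
static inline bool allows_direct_reclaim(gfp_t gfp)
{
	return (gfp & __GFP_DIRECT_RECLAIM) != 0;
}

/* GFP_KERNEL may stall in direct reclaim, GFP_NOWAIT may not. */
void gfp_nowait_demo(void)
{
	assert(allows_direct_reclaim(GFP_KERNEL));
	assert(!allows_direct_reclaim(GFP_NOWAIT));
}
```
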
> 
> The real issue we need to address is improving the management of these
> GFP flags, though I don't have a concrete solution at this time.

Welcome to the club. Changing this interface is a _huge_ undertaking.
Just have a look at how many users of the gfp flags we have in the
kernel. I can tell you from first-hand experience that even minor
tweaks are really hard to make.
-- 
Michal Hocko
SUSE Labs



