On Thu, Aug 29, 2024 at 11:12:18PM GMT, Dave Chinner wrote:
> On Thu, Aug 29, 2024 at 06:02:32AM -0400, Kent Overstreet wrote:
> > On Wed, Aug 28, 2024 at 02:09:57PM GMT, Dave Chinner wrote:
> > > On Tue, Aug 27, 2024 at 08:15:43AM +0200, Michal Hocko wrote:
> > > > From: Michal Hocko <mhocko@xxxxxxxx>
> > > >
> > > > bch2_new_inode relies on PF_MEMALLOC_NORECLAIM to try to allocate a new
> > > > inode to achieve GFP_NOWAIT semantic while holding locks. If this
> > > > allocation fails it will drop locks and use GFP_NOFS allocation context.
> > > >
> > > > We would like to drop PF_MEMALLOC_NORECLAIM because it is really
> > > > dangerous to use if the caller doesn't control the full call chain with
> > > > this flag set. E.g. if any of the function down the chain needed
> > > > GFP_NOFAIL request the PF_MEMALLOC_NORECLAIM would override this and
> > > > cause unexpected failure.
> > > >
> > > > While this is not the case in this particular case using the scoped gfp
> > > > semantic is not really needed because we can easily push the allocation
> > > > context down the chain without too much clutter.
> > > >
> > > > Acked-by: Christoph Hellwig <hch@xxxxxx>
> > > > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> > >
> > > Looks good to me.
> > >
> > > Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
> >
> > Reposting what I wrote in the other thread:
>
> I've read the thread. I've heard what you have had to say. Like
> several other people, I think your position is just not practical or
> reasonable.
>
> I don't care about the purity or the safety of the API - the
> practical result of PF_MEMALLOC_NORECLAIM is that __GFP_NOFAIL
> allocation can now fail and that will cause unexpected kernel
> crashes. Keeping existing code and API semantics working correctly
> (i.e. regression free) takes precedence over new functionality or
> API features that people want to introduce.
>
> That's all there is to it. This is not a hill you need to die on.

If you use GFP_NOFAIL in a context where you're not allowed to sleep,
that's a bug, same as any other bug where you get the gfp flags wrong
(e.g. GFP_KERNEL in non sleepable context).

This isn't going to affect you unless you start going around inserting
PF_MEMALLOC_NORECLAIM where it doesn't need to be. Why would you do
that?

But the lack of gfp flags for pte allocation means that this actually is
a serious gap we need to be fixing.
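
For anyone following along without the source open, here is a rough
userspace sketch of the two behaviours being argued about: the "try
without sleeping under the lock, then drop the lock and retry" fallback
from the quoted patch description, and a scoped no-reclaim flag
overriding a no-fail request. It is a toy model, not bcachefs or mm
code. Every name in it (toy_alloc, toy_new_inode, the TOY_* flags,
toy_pressure, toy_scope_noreclaim) is made up for illustration, and the
"reclaim" is just the assumption that a sleeping allocation succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins for gfp flags; the values are arbitrary, not the kernel's. */
enum {
    TOY_NOWAIT = 1u << 0, /* caller cannot sleep or reclaim */
    TOY_NOFS   = 1u << 1, /* caller may sleep, but not recurse into the fs */
    TOY_NOFAIL = 1u << 2, /* caller expects the allocation never to fail */
};

static bool toy_pressure;        /* simulated memory pressure */
static bool toy_scope_noreclaim; /* models a task-wide "no reclaim" scope */

static void *toy_alloc(size_t size, unsigned int gfp)
{
    assert(size > 0);
    /* The scoped flag silently strips the ability to sleep from every
     * allocation made inside the scope, whatever the caller passed. */
    bool may_sleep = !(gfp & TOY_NOWAIT) && !toy_scope_noreclaim;

    if (toy_pressure && !may_sleep)
        return NULL;     /* no reclaim possible: fails, even with TOY_NOFAIL */
    return malloc(size); /* assumption: a sleeping allocation can make progress */
}

struct toy_lock {
    bool held;
    int drops; /* how many times the lock had to be dropped */
};

/* Explicit-flags version of the fallback: no scoped state involved. */
static void *toy_new_inode(struct toy_lock *lock, size_t size)
{
    void *p = toy_alloc(size, TOY_NOWAIT); /* fast path, lock still held */
    if (p)
        return p;

    lock->held = false;                    /* slow path: drop the lock ... */
    lock->drops++;
    p = toy_alloc(size, TOY_NOFS);         /* ... and allow sleeping */
    lock->held = true;                     /* retake before returning */
    return p;
}
```

With toy_pressure set, toy_new_inode() drops the lock once and still
returns memory, while toy_alloc(size, TOY_NOFAIL) returns NULL as soon
as toy_scope_noreclaim is set. Whether that second case is a caller bug
or an API regression is exactly the disagreement above.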