On Thu, Jan 04, 2024 at 09:17:16PM +0000, Matthew Wilcox wrote:
> This is primarily a _FILESYSTEM_ track topic.  All the work has already
> been done on the MM side; the FS people need to do their part.  It could
> be a joint session, but I'm not sure there's much for the MM people
> to say.
>
> There are situations where we need to allocate memory, but cannot call
> into the filesystem to free memory.  Generally this is because we're
> holding a lock or we've started a transaction, and attempting to write
> out dirty folios to reclaim memory would result in a deadlock.
>
> The old way to solve this problem is to specify GFP_NOFS when allocating
> memory.  This conveys little information about what is being protected
> against, and so it is hard to know when it might be safe to remove.
> It's also a reflex -- many filesystem authors use GFP_NOFS by default
> even when they could use GFP_KERNEL because there's no risk of deadlock.
>
> The new way is to use the scoped APIs -- memalloc_nofs_save() and
> memalloc_nofs_restore().  These should be called when we start a
> transaction or take a lock that would cause a GFP_KERNEL allocation to
> deadlock.  Then just use GFP_KERNEL as normal.  The memory allocators
> can see the nofs situation is in effect and will not call back into
> the filesystem.

So in rebasing the XFS kmem.[ch] removal patchset I've been working on,
there is a clear piece of memory allocator functionality that we need to
be scoped: __GFP_NOFAIL.

All of the allocations done through the existing XFS kmem.[ch]
interfaces (i.e. just about everything) have __GFP_NOFAIL semantics
added except in the explicit cases where we add KM_MAYFAIL to indicate
that the allocation can fail.

The result of this conversion to remove GFP_NOFS is that I'm also adding
*dozens* of __GFP_NOFAIL annotations because we effectively scope that
behaviour.

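The scoped NOFS pattern described in the quoted text can be sketched as
a small userspace model. This is not kernel code: every MODEL_* constant
and model_* function below is a hypothetical stand-in, the bit values are
made up, and a plain static variable stands in for the per-task flags
word. It only aims to show the save/restore nesting and how a plain
GFP_KERNEL-style request would lose its "may call into the filesystem"
bit while a scope is active.

```c
#include <assert.h>

/* Userspace model only: bit values are illustrative, not the kernel's. */
#define MODEL_GFP_IO      0x1u
#define MODEL_GFP_FS      0x2u
#define MODEL_GFP_KERNEL  (MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS    (MODEL_GFP_IO)

#define MODEL_PF_MEMALLOC_NOFS 0x1u

/* Stand-in for the per-task flags word (per-task state in the kernel). */
static unsigned int model_task_flags;

/* Enter a NOFS scope; returns the previous scope state for restore. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_MEMALLOC_NOFS;

	model_task_flags |= MODEL_PF_MEMALLOC_NOFS;
	return old;
}

/* Leave a NOFS scope, putting back whatever state the save returned. */
static void model_nofs_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOFS) | old;
}

/* The mask the allocator would effectively act on for a request. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	if (model_task_flags & MODEL_PF_MEMALLOC_NOFS)
		gfp &= ~MODEL_GFP_FS;
	return gfp;
}

/* Nested scopes, e.g. a transaction started while a lock is held. */
static int model_demo(void)
{
	unsigned int outer = model_nofs_save();	/* take the lock */
	unsigned int inner = model_nofs_save();	/* start the transaction */

	/* Callers just ask for GFP_KERNEL; the scope strips the FS bit. */
	assert(model_effective_gfp(MODEL_GFP_KERNEL) == MODEL_GFP_NOFS);

	model_nofs_restore(inner);
	/* Still inside the outer scope, so still NOFS. */
	assert(model_effective_gfp(MODEL_GFP_KERNEL) == MODEL_GFP_NOFS);

	model_nofs_restore(outer);
	/* Outside every scope, GFP_KERNEL is untouched again. */
	assert(model_effective_gfp(MODEL_GFP_KERNEL) == MODEL_GFP_KERNEL);
	return 0;
}
```

Returning the previous state from the save call is what makes nesting
safe in this model: the inner restore cannot accidentally end the outer
scope.
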
Hence I think this discussion needs to consider that __GFP_NOFAIL is
also widely used within critical filesystem code that cannot gracefully
recover from memory allocation failures, and that this would also be
useful to scope....

Yeah, I know, mm developers hate __GFP_NOFAIL. We've been using NOFAIL
semantics in XFS for over 2 decades and the sky hasn't fallen. So can we
get memalloc_nofail_{save,restore}() so that we can change the default
allocation behaviour in certain contexts (e.g. the same contexts we need
NOFS allocations) to be NOFAIL unless __GFP_RETRY_MAYFAIL or
__GFP_NORETRY are set?

We already have memalloc_noreclaim_{save,restore}() for turning off
direct memory reclaim for a given context (i.e. equivalent of clearing
__GFP_DIRECT_RECLAIM), so if we are going to embrace scoped allocation
contexts, then we should be going all in and providing all the contexts
that filesystems actually need....

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
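
The semantics requested above for memalloc_nofail_{save,restore}() can
be sketched in the same userspace style. To be clear about assumptions:
no such scope exists in the kernel as of this mail, every SK_* constant
and sk_* function here is hypothetical, and the bit values are
illustrative. The sketch only encodes the stated rule: inside the scope,
an allocation defaults to NOFAIL unless the caller passed the
RETRY_MAYFAIL or NORETRY equivalent.

```c
#include <assert.h>

/* Userspace model only: bit values are illustrative, not the kernel's. */
#define SK_GFP_KERNEL         0x01u
#define SK_GFP_NOFAIL         0x02u
#define SK_GFP_RETRY_MAYFAIL  0x04u
#define SK_GFP_NORETRY        0x08u

/* Hypothetical scope bit; models the proposed (non-existent) context. */
#define SK_PF_MEMALLOC_NOFAIL 0x1u

/* Stand-in for the per-task flags word. */
static unsigned int sk_task_flags;

/* Enter the hypothetical NOFAIL scope; returns the previous state. */
static unsigned int sk_nofail_save(void)
{
	unsigned int old = sk_task_flags & SK_PF_MEMALLOC_NOFAIL;

	sk_task_flags |= SK_PF_MEMALLOC_NOFAIL;
	return old;
}

/* Leave the scope, restoring the state returned by the matching save. */
static void sk_nofail_restore(unsigned int old)
{
	sk_task_flags = (sk_task_flags & ~SK_PF_MEMALLOC_NOFAIL) | old;
}

/* NOFAIL becomes the default in scope unless the caller opted out. */
static unsigned int sk_effective_gfp(unsigned int gfp)
{
	unsigned int mayfail = SK_GFP_RETRY_MAYFAIL | SK_GFP_NORETRY;

	if ((sk_task_flags & SK_PF_MEMALLOC_NOFAIL) && !(gfp & mayfail))
		gfp |= SK_GFP_NOFAIL;
	return gfp;
}

/* One scope with a default request and an explicit may-fail request. */
static int sk_demo(void)
{
	unsigned int old = sk_nofail_save();

	/* Plain request inside the scope picks up NOFAIL. */
	assert(sk_effective_gfp(SK_GFP_KERNEL) ==
	       (SK_GFP_KERNEL | SK_GFP_NOFAIL));
	/* An explicit may-fail request is passed through unchanged. */
	assert(sk_effective_gfp(SK_GFP_KERNEL | SK_GFP_NORETRY) ==
	       (SK_GFP_KERNEL | SK_GFP_NORETRY));

	sk_nofail_restore(old);
	/* Outside the scope nothing is added. */
	assert(sk_effective_gfp(SK_GFP_KERNEL) == SK_GFP_KERNEL);
	return 0;
}
```

In this model the explicit may-fail flags play the role that KM_MAYFAIL
plays in the XFS kmem.[ch] wrappers: the opt-out is marked at the call
site, and everything else in the scope inherits NOFAIL.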