On Fri, Apr 08, 2022 at 12:46:08PM +1000, NeilBrown wrote:
> On Thu, 07 Apr 2022, Trond Myklebust wrote:
> > The bottom line is that we use ordinary GFP_KERNEL memory allocations
> > where we can. The new code follows that rule, breaking it only in cases
> > where the specific rules of rpciod/xprtiod/nfsiod make it impossible to
> > wait forever in the memory manager.
>
> It is not safe to use GFP_KERNEL for an allocation that is needed in
> order to free memory - and so any allocation that is needed to write out
> data from the page cache.

Except that same page cache writeback path can be called from
syscall context (e.g. fsync()) which has nothing to do with memory
reclaim. In that case GFP_KERNEL is the correct allocation context
to use because there are no constraints on what memory reclaim can
be performed from this path.

IOWs, if the context initiating data writeback doesn't allow
GFP_KERNEL allocations, then it should be calling
memalloc_nofs_save() or memalloc_noio_save() to constrain all
allocations to the required context. We should not be requiring the
filesystem (or any other subsystem) to magically infer that the IO
is being done in a constrained allocation context and modify the
context they use appropriately.

If we did this, then all filesystems would simply use GFP_NOIO
everywhere because the loop device layers the entire filesystem IO
path under block device context (i.e. requiring GFP_NOIO allocation
context). We don't do this - the loop device sets PF_MEMALLOC_NOIO
instead so all allocations in that path run with at least GFP_NOIO
constraints and filesystems are none the wiser about the constraints
of the calling context.

IOWs, GFP_KERNEL is generally the right context to be using in
filesystem IO paths and callers need to restrict allocation contexts
via task flags if they cannot allow certain types of reclaim
recursion to occur...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
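
The scoped-constraint pattern described above can be illustrated with a
small userspace model. This is only a sketch, not kernel code: the names
memalloc_noio_save(), memalloc_noio_restore(), current_gfp_context() and
PF_MEMALLOC_NOIO mirror the kernel's, but the bit values are simplified
stand-ins, current_flags is a plain global standing in for
current->flags, and fs_writeback_alloc() and constrained_writeback() are
hypothetical helpers invented for the illustration.

```c
#include <assert.h>

typedef unsigned int gfp_t;

/* Simplified stand-ins for the kernel's GFP bits (values are NOT the real ones). */
#define __GFP_IO      0x1u  /* reclaim may start block IO */
#define __GFP_FS      0x2u  /* reclaim may recurse into the filesystem */
#define __GFP_RECLAIM 0x4u  /* reclaim is allowed at all */
#define GFP_NOIO   (__GFP_RECLAIM)
#define GFP_NOFS   (__GFP_RECLAIM | __GFP_IO)
#define GFP_KERNEL (__GFP_RECLAIM | __GFP_IO | __GFP_FS)

/* Simplified per-task flags (values are NOT the real ones). */
#define PF_MEMALLOC_NOIO 0x1u
#define PF_MEMALLOC_NOFS 0x2u

/* Model of current->flags for a single task. */
static unsigned int current_flags;

/* Enter a NOIO scope; returns the previous state of the flag. */
static unsigned int memalloc_noio_save(void)
{
    unsigned int old = current_flags & PF_MEMALLOC_NOIO;

    current_flags |= PF_MEMALLOC_NOIO;
    return old;
}

/* Leave a NOIO scope, restoring whatever the flag was before the save. */
static void memalloc_noio_restore(unsigned int old)
{
    assert(old == 0 || old == PF_MEMALLOC_NOIO);
    current_flags = (current_flags & ~PF_MEMALLOC_NOIO) | old;
}

/* Narrow the requested mask according to the task's scope flags. */
static gfp_t current_gfp_context(gfp_t flags)
{
    if (current_flags & PF_MEMALLOC_NOIO)
        flags &= ~(__GFP_IO | __GFP_FS);
    else if (current_flags & PF_MEMALLOC_NOFS)
        flags &= ~__GFP_FS;
    return flags;
}

/*
 * Hypothetical filesystem IO path: it always asks for GFP_KERNEL and
 * returns the mask the allocator would effectively use.
 */
static gfp_t fs_writeback_alloc(void)
{
    return current_gfp_context(GFP_KERNEL);
}

/*
 * Hypothetical constrained caller (in the spirit of the loop device):
 * it sets the scope, then calls the unmodified filesystem path.
 */
static gfp_t constrained_writeback(void)
{
    unsigned int old = memalloc_noio_save();
    gfp_t effective = fs_writeback_alloc();

    memalloc_noio_restore(old);
    return effective;
}
```

In this model fs_writeback_alloc() called directly (the fsync() case)
yields the full GFP_KERNEL mask, while the same function called via
constrained_writeback() yields GFP_NOIO, without the filesystem path
knowing anything about its caller.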