On Wed, Mar 28, 2018 at 09:01:13AM +0200, Michal Hocko wrote:
> On Tue 27-03-18 10:13:53, Goldwyn Rodrigues wrote:
> >
> >
> > On 03/27/2018 09:21 AM, Matthew Wilcox wrote:
> [...]
> > > Maybe no real filesystem behaves that way. We need feedback from
> > > filesystem people.
> >
> > The idea is to:
> > * Keep a central location for check, rather than individual filesystem
> >   writepage(). It should reduce code as well.
> > * Filesystem developers call memory allocations without thinking twice
> >   about which GFP flag to use: GFP_KERNEL or GFP_NOFS. In essence
> >   eliminate GFP_NOFS.
>
> I do not think this is the right approach. We do want to eliminate
> explicit GFP_NOFS usage, but we also want to reduce the overall GFP_NOFS
> usage as well. The latter requires that we drop the __GFP_FS only for
> those contexts that really might cause reclaim recursion problems.

As I've said before, moving to a scoped API will not reduce the
number of GFP_NOFS scope allocation points - removing individual
GFP_NOFS annotations doesn't do anything to avoid the deadlock paths
it protects against.

The issue is that GFP_NOFS is a big hammer - it stops reclaim from
all filesystem scopes, not just the one we hold locks on and are
doing the allocation for. i.e. we can be in one filesystem and quite
safely do reclaim from other filesystems. The global scope of
GFP_NOFS just doesn't allow this sort of fine-grained control to be
expressed in reclaim.

IOWs, if we want to reduce the scope of GFP_NOFS, we need a context
to be passed from allocation to reclaim so that the reclaim context
can check that it's a safe allocation context to reclaim from.

e.g. for GFP_NOFS, we can use the superblock of the allocating
filesystem as the context, and check it against the superblock that
the current reclaim context (e.g. shrinker invocation) belongs to.
If they match, we skip it. If they don't match, then we can perform
reclaim on that context.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx