On Thu, Mar 29, 2018 at 09:01:08AM +0200, Michal Hocko wrote:
> On Thu 29-03-18 10:57:02, Dave Chinner wrote:
> > On Wed, Mar 28, 2018 at 09:01:13AM +0200, Michal Hocko wrote:
> > > On Tue 27-03-18 10:13:53, Goldwyn Rodrigues wrote:
> > > >
> > > >
> > > > On 03/27/2018 09:21 AM, Matthew Wilcox wrote:
> > > [...]
> > > > > Maybe no real filesystem behaves that way. We need feedback from
> > > > > filesystem people.
> > > >
> > > > The idea is to:
> > > > * Keep a central location for check, rather than individual filesystem
> > > >   writepage(). It should reduce code as well.
> > > > * Filesystem developers call memory allocations without thinking twice
> > > >   about which GFP flag to use: GFP_KERNEL or GFP_NOFS. In essence
> > > >   eliminate GFP_NOFS.
> > >
> > > I do not think this is the right approach. We do want to eliminate
> > > explicit GFP_NOFS usage, but we also want to reduce the overall GFP_NOFS
> > > usage as well. The latter requires that we drop the __GFP_FS only for
> > > those contexts that really might cause reclaim recursion problems.
> >
> > As I've said before, moving to a scoped API will not reduce the
> > number of GFP_NOFS scope allocation points - removing individual
> > GFP_NOFS annotations doesn't do anything to avoid the deadlock paths
> > it protects against.
>
> Maybe it doesn't for some filesystems like xfs but I am quite sure it
> will for some others which overuse GFP_NOFS just to be sure. E.g. btrfs.
>
> > The issue is that GFP_NOFS is a big hammer - it stops reclaim from
> > all filesystem scopes, not just the one we hold locks on and are
> > doing the allocation for. i.e. we can be in one filesystem and quite
> > safely do reclaim from other filesystems. The global scope of
> > GFP_NOFS just doesn't allow this sort of fine-grained control to be
> > expressed in reclaim.
>
> Agreed!
>
> > IOWs, if we want to reduce the scope of GFP_NOFS, we need a context
> > to be passed from allocation to reclaim so that the reclaim context
> > can check that it's a safe allocation context to reclaim from. e.g.
> > for GFP_NOFS, we can use the superblock of the allocating filesystem
> > as the context, and check it against the superblock that the current
> > reclaim context (e.g. shrinker invocation) belongs to. If they
> > match, we skip it. If they don't match, then we can perform reclaim
> > on that context.
>
> Agreed again. But this is hardly doable without actually defining what
> those scopes are. Once we have them we can expand to add more context.

Some filesystems already have well defined scopes (e.g. XFS's
transaction scope) - all we need is the infrastructure that passes
the scope pointer to reclaim rather than having the allocation code
intercept PF_MEMALLOC_NOFS and turn it into GFP_NOFS allocation
context...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
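
To make the superblock-as-context idea above concrete, here is a rough
userspace model of it. This is only a sketch and not kernel code: the
types and the functions fs_scope_save(), fs_scope_restore() and
may_reclaim_from() are hypothetical names made up for illustration, and
nothing like them is claimed to exist in the tree. The real scoped API
(memalloc_nofs_save()/memalloc_nofs_restore()) only sets a task flag and
carries no pointer. The model assumes a single scope pointer per task,
so a nested scope from a different filesystem would hide the outer one;
real infrastructure would have to deal with that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a filesystem superblock. */
struct super_block_model {
	const char *name;
};

/* Hypothetical stand-in for per-task state; NULL means no scope is active. */
struct task_model {
	const struct super_block_model *nofs_scope;
};

/*
 * Enter a scope for filesystem @sb. Returns the previously active scope
 * so the caller can hand it back to fs_scope_restore().
 */
static const struct super_block_model *
fs_scope_save(struct task_model *t, const struct super_block_model *sb)
{
	const struct super_block_model *old = t->nofs_scope;

	assert(sb != NULL);
	t->nofs_scope = sb;
	return old;
}

/* Leave the scope, reinstating whatever was active before. */
static void
fs_scope_restore(struct task_model *t, const struct super_block_model *old)
{
	t->nofs_scope = old;
}

/*
 * Reclaim-side check: a shrinker belonging to @shrinker_sb is skipped
 * only when it matches the scope of the allocating task. Any other
 * filesystem remains eligible for reclaim.
 */
static bool
may_reclaim_from(const struct task_model *t,
		 const struct super_block_model *shrinker_sb)
{
	if (t->nofs_scope == NULL)
		return true;
	return t->nofs_scope != shrinker_sb;
}
```

With a scope entered for one filesystem, may_reclaim_from() refuses only
that filesystem's shrinker and still allows all the others, which is the
fine-grained behaviour a global GFP_NOFS cannot express.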