On Thu 29-03-18 10:57:02, Dave Chinner wrote:
> On Wed, Mar 28, 2018 at 09:01:13AM +0200, Michal Hocko wrote:
> > On Tue 27-03-18 10:13:53, Goldwyn Rodrigues wrote:
> > >
> > > On 03/27/2018 09:21 AM, Matthew Wilcox wrote:
> > [...]
> > > > Maybe no real filesystem behaves that way. We need feedback from
> > > > filesystem people.
> > >
> > > The idea is to:
> > > * Keep a central location for check, rather than individual filesystem
> > > writepage(). It should reduce code as well.
> > > * Filesystem developers call memory allocations without thinking twice
> > > about which GFP flag to use: GFP_KERNEL or GFP_NOFS. In essence
> > > eliminate GFP_NOFS.
> >
> > I do not think this is the right approach. We do want to eliminate
> > explicit GFP_NOFS usage, but we also want to reduce the overall GFP_NOFS
> > usage as well. The latter requires that we drop the __GFP_FS only for
> > those contexts that really might cause reclaim recursion problems.
>
> As I've said before, moving to a scoped API will not reduce the
> number of GFP_NOFS scope allocation points - removing individual
> GFP_NOFS annotations doesn't do anything to avoid the deadlock paths
> it protects against.

Maybe it doesn't for some filesystems like xfs, but I am quite sure it
will for some others which overuse GFP_NOFS just to be sure. E.g. btrfs.

> The issue is that GFP_NOFS is a big hammer - it stops reclaim from
> all filesystem scopes, not just the one we hold locks on and are
> doing the allocation for. i.e. we can be in one filesystem and quite
> safely do reclaim from other filesystems. The global scope of
> GFP_NOFS just doesn't allow this sort of fine-grained control to be
> expressed in reclaim.

Agreed!

> IOWs, if we want to reduce the scope of GFP_NOFS, we need a context
> to be passed from allocation to reclaim so that the reclaim context
> can check that it's a safe allocation context to reclaim from. e.g.
> for GFP_NOFS, we can use the superblock of the allocating filesystem
> as the context, and check it against the superblock that the current
> reclaim context (e.g. shrinker invocation) belongs to. If they
> match, we skip it. If they don't match, then we can perform reclaim
> on that context.

Agreed again. But this is hardly doable without actually defining what
those scopes are. Once we have them we can expand to add more context.
-- 
Michal Hocko
SUSE Labs
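
For illustration only, the superblock comparison described in the quoted
text could look roughly like the userspace sketch below. Nothing here is
existing kernel API: struct reclaim_ctx, its fields, and
may_reclaim_from() are hypothetical names, and struct super_block is
just a stand-in for the kernel structure. A NULL superblock is assumed
to mean "scope unknown", which falls back to today's global GFP_NOFS
behaviour.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's struct super_block (illustration only). */
struct super_block {
	const char *name;
};

/*
 * Hypothetical context handed from the allocation site to reclaim.
 * It does not exist in the kernel; it only models the proposal.
 */
struct reclaim_ctx {
	bool nofs;                    /* allocation was made in a NOFS scope */
	const struct super_block *sb; /* allocating filesystem, NULL if unknown */
};

/*
 * Decide whether reclaim may touch objects belonging to @target.
 *  - no NOFS scope:         every filesystem may be reclaimed from
 *  - NOFS, unknown scope:   skip all filesystems (current global behaviour)
 *  - NOFS, known scope:     skip only the allocating filesystem
 */
static bool may_reclaim_from(const struct reclaim_ctx *ctx,
			     const struct super_block *target)
{
	if (!ctx->nofs)
		return true;
	if (ctx->sb == NULL)
		return false;
	return ctx->sb != target;
}
```

With such a check, a shrinker invoked for the allocating filesystem
would be skipped while shrinkers of other filesystems could still run.
It still depends on the scopes being defined first, since that is where
the superblock in the context would come from.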