Re: [PATCH 1/3] fs: Perform writebacks under memalloc_nofs

Dave Chinner <david@xxxxxxxxxxxxx> · Thu, 29 Mar 2018 10:57:02 +1100

On Wed, Mar 28, 2018 at 09:01:13AM +0200, Michal Hocko wrote:
> On Tue 27-03-18 10:13:53, Goldwyn Rodrigues wrote:
> > 
> > 
> > On 03/27/2018 09:21 AM, Matthew Wilcox wrote:
> [...]
> > > Maybe no real filesystem behaves that way.  We need feedback from
> > > filesystem people.
> > 
> > The idea is to:
> > * Keep a central location for check, rather than individual filesystem
> > writepage(). It should reduce code as well.
> > * Filesystem developers call memory allocations without thinking twice
> > about which GFP flag to use: GFP_KERNEL or GFP_NOFS. In essence
> > eliminate GFP_NOFS.
> 
> I do not think this is the right approach. We do want to eliminate
> explicit GFP_NOFS usage, but we also want to reduce the overall GFP_NOFS
> usage as well. The latter requires that we drop the __GFP_FS only for
> those contexts that really might cause reclaim recursion problems.

As I've said before, moving to a scoped API will not reduce the
number of allocation points that run under GFP_NOFS scope - removing
individual GFP_NOFS annotations doesn't do anything to avoid the
deadlock paths they protect against.

The issue is that GFP_NOFS is a big hammer - it stops reclaim from
all filesystem scopes, not just the one we hold locks on and are
doing the allocation for. i.e. we can be in one filesystem and quite
safely do reclaim from other filesystems. The global scope of
GFP_NOFS just doesn't allow this sort of fine-grained control to be
expressed in reclaim.

IOWs, if we want to reduce the scope of GFP_NOFS, we need a context
to be passed from allocation to reclaim so that the reclaim context
can check that it's a safe allocation context to reclaim from. e.g.
for GFP_NOFS, we can use the superblock of the allocating filesystem
as the context, and check it against the superblock that the current
reclaim context (e.g. shrinker invocation) belongs to. If they
match, we skip it. If they don't match, then we can perform reclaim
on that context.
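
A minimal userspace sketch of the superblock check described above,
set against the current global behaviour. Nothing here is kernel
code: struct model_sb, struct model_alloc_ctx, may_reclaim_global()
and may_reclaim_scoped() are hypothetical names made up for
illustration, and staying conservative when no superblock was
recorded is an assumption, not something settled in this thread.

```c
/* Userspace model only; none of these names are kernel APIs. */
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct super_block. */
struct model_sb {
	const char *name;
};

/* Hypothetical context passed from the allocation down to reclaim. */
struct model_alloc_ctx {
	bool fs_reclaim_allowed;		/* models __GFP_FS being set */
	const struct model_sb *alloc_sb;	/* allocating fs, NULL if unknown */
};

/* Global behaviour: without __GFP_FS, no filesystem is reclaimed from. */
static bool may_reclaim_global(const struct model_alloc_ctx *ctx,
			       const struct model_sb *target)
{
	(void)target;	/* the filesystem being reclaimed is never consulted */
	return ctx->fs_reclaim_allowed;
}

/* Scoped behaviour: skip only the filesystem doing the allocation. */
static bool may_reclaim_scoped(const struct model_alloc_ctx *ctx,
			       const struct model_sb *target)
{
	if (ctx->fs_reclaim_allowed)
		return true;
	if (!ctx->alloc_sb)
		return false;	/* assumption: no context, stay conservative */
	return ctx->alloc_sb != target;	/* match: skip, mismatch: reclaim */
}
```

With a NOFS allocation made on behalf of one filesystem, the global
check refuses reclaim from every filesystem, while the scoped check
refuses only the allocating one and lets the others be reclaimed.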

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx