Re: [PATCH 3/5] nfsd: Only set PF_LESS_THROTTLE when really needed.

NeilBrown <neilb@xxxxxxx> · Mon, 12 May 2014 11:04:37 +1000

On Tue, 06 May 2014 17:05:01 -0400 Rik van Riel <riel@xxxxxxxxxx> wrote:

> On 04/22/2014 10:40 PM, NeilBrown wrote:
> > PF_LESS_THROTTLE has a very specific use case: to avoid deadlocks
> > and live-locks while writing to the page cache in a loop-back
> > NFS mount situation.
> > 
> > It therefore makes sense to *only* set PF_LESS_THROTTLE in this
> > situation.
> > We now know when a request came from the local-host so it could be a
> > loop-back mount.  We already know when we are handling write requests,
> > and when we are doing anything else.
> > 
> > So combine those two to allow nfsd to still be throttled (like any
> > other process) in every situation except when it is known to be
> > problematic.
> 
> The FUSE code has something similar, but on the "client"
> side.
> 
> See BDI_CAP_STRICTLIMIT in mm/writeback.c
> 
> Would it make sense to use that flag on loopback-mounted
> NFS filesystems?
> 

I don't think so.

I don't fully understand BDI_CAP_STRICTLIMIT, but it seems to be very
fuse-specific and relates to NR_WRITEBACK_TEMP, which only fuse uses.  NFS
doesn't need any 'strict' limits.
That is, it looks like fuse-specific code inside core-vm code, which I would
rather steer clear of.

Setting a bdi flag for a loopback-mounted NFS filesystem isn't really
possible because the "is it loopback mounted" state is fluid.  IP addresses can
be migrated (for HA cluster failover) and what was originally a remote-NFS
mount can become a loopback NFS mount (and that is exactly the case I need to
deal with).

So we can only really assess "is it loop-back" on a per-request basis.

This patch does that assessment in nfsd to limit the use of PF_LESS_THROTTLE.
Another patch does it in nfs to limit the waiting in nfs_release_page.
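
As a rough illustration of the per-request decision described above, here is
a small userspace sketch.  It is not the code from the patch: every name in it
(the sketch_* types and functions, SKETCH_PF_LESS_THROTTLE and its value) is
hypothetical, and the "is this address local" test is modelled as a simple
lookup in a list that may change between requests, as it would after an IP
address migrates during failover.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-task flag; the value is illustrative. */
#define SKETCH_PF_LESS_THROTTLE (1u << 0)

/* Hypothetical view of one incoming request. */
struct sketch_request {
	uint32_t src_addr;	/* IPv4 source address of the request */
	bool is_write;		/* true when handling a write request */
};

/* Addresses currently assigned to this host; may change over time. */
struct sketch_host {
	const uint32_t *addrs;
	size_t naddrs;
};

/* Assess "did this request come from the local host" at request time. */
static bool sketch_is_local(const struct sketch_host *h, uint32_t addr)
{
	for (size_t i = 0; i < h->naddrs; i++)
		if (h->addrs[i] == addr)
			return true;
	return false;
}

/*
 * Return the task flags to use while handling this one request: the
 * less-throttle flag is set only for a write from the local host and is
 * cleared in every other case.
 */
static unsigned int sketch_flags_for_request(const struct sketch_host *h,
					     const struct sketch_request *rq,
					     unsigned int flags)
{
	if (rq->is_write && sketch_is_local(h, rq->src_addr))
		return flags | SKETCH_PF_LESS_THROTTLE;
	return flags & ~SKETCH_PF_LESS_THROTTLE;
}
```

Because the address list is consulted on each call, a write from an address
that was remote before a failover and local afterwards gets the flag only
once the address has moved, which a flag fixed at mount time could not
express.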

Thanks,
NeilBrown