Re: A NFS, xfs, reflink and rmapbt story

Dave Chinner <david@xxxxxxxxxxxxx> · Tue, 28 Jan 2020 10:56:17 +1100

On Thu, Jan 23, 2020 at 05:10:19PM -0800, Darrick J. Wong wrote:
> On Thu, Jan 23, 2020 at 04:32:17PM +0800, Murphy Zhou wrote:
> > Hi,
> > 
> > Deleting the files left by generic/175 costs too much time when testing
> > on NFSv4.2 exporting xfs with rmapbt=1.
> > 
> > "./check -nfs generic/175 generic/176" should reproduce it.
> > 
> > My test bed is a 16c8G vm.
> 
> What kind of storage?

Is the NFS server the same machine as what the local XFS tests were
run on?

> > NFSv4.2  rmapbt=1   24h+
> 
> <URK> Wow.  I wonder what about NFS makes us so slow now?  Synchronous
> transactions on the inactivation?  (speculates wildly at the end of the
> workday)

Doubt it - NFS server uses ->commit_metadata after the async
operation to ensure that it is completed and on stable storage, so
the truncate on inactivation should run at pretty much the same
speed as on a local filesystem as it's still all async commits. i.e.
the only difference on the NFS server is the log force that follows
the inode inactivation...

> I'll have a look in the morning.  It might take me a while to remember
> how to set up NFS42 :)
> 
> --D
> 
> > NFSv4.2  rmapbt=0   1h-2h
> > xfs      rmapbt=1   10m+
> > 
> > At first I thought it hung, turns out it was just slow when deleting
> > 2 massive reflinked files.

Both tests run on the scratch device, so I don't see where there is
a large file unlink in either of these tests.

In which case, I'd expect that all the time is consumed in
generic/176 running punch_alternating to create a million extents
as that will effectively run a synchronous server-side hole punch
half a million times.
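
As a rough back-of-envelope sketch of why that count matters: the
half a million punches is the figure above, but the per-punch
latencies below are purely illustrative assumptions, not
measurements from this setup, and the function name is made up
for the example.

```python
# Estimate total wall time if every hole punch is a synchronous
# round trip that has to wait for a server-side log force.

def estimated_runtime_seconds(punches, seconds_per_punch):
    """Total time for `punches` strictly serialised synchronous punches."""
    return punches * seconds_per_punch

PUNCHES = 500_000  # "half a million" hole punches, per the text above

# Hypothetical per-punch latencies, chosen only to show the scaling.
for latency_ms in (1, 10, 100):
    secs = estimated_runtime_seconds(PUNCHES, latency_ms / 1000)
    print(f"{latency_ms:>4} ms/punch -> {secs / 3600:.2f} hours")
```

The point is only that total time scales linearly with the
per-punch latency, so anything that makes each synchronous punch
slower gets multiplied half a million times.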

However, I'm guessing that the server side filesystem has a very
small log and is on spinning rust, hence the ->commit_metadata log
forces are preventing in-memory aggregation of modifications. This
results in the working set of metadata not fitting in the log and so
each new hole punch transaction ends up waiting on log tail pushing
(i.e. metadata writeback IO). In other words, it's thrashing the disk,
and that's why it is slow.

Storage details, please!

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx