Re: Proposal: Use hi-res clock for file timestamps

Neil Brown <neilb@xxxxxxx> · Thu, 19 Aug 2010 09:41:36 +1000

On Wed, 18 Aug 2010 14:15:51 -0400
Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:

> 
> On Aug 18, 2010, at 1:32 PM, J. Bruce Fields wrote:
> 
> > On Wed, Aug 18, 2010 at 03:53:59PM +1000, Neil Brown wrote:
> >> I'm not sure you even want to pay for a per-filesystem atomic access when
> >> updating mtime.  mnt_want_write - called at the same time - seems to go to
> >> some lengths to avoid an atomic operation.
> >> 
> >> I think that nfsd should be the only place that has to pay the atomic
> >> penalty, as it is where the need is.
> >> 
> >> I imagine something like this:
> >> - Create a global struct timespec which is protected by a seqlock
> >>   Call it current_nfsd_time or similar.
> >> - file_update_time reads this and uses it if it is newer than
> >>   current_fs_time.
> >> - nfsd updates it whenever it reads an mtime out of an inode that matches
> >>   current_fs_time to the granularity of 1/HZ.
> > 
> > We can also skip the update whenever current_nfsd_time is greater than
> > the inode's mtime--that's enough to ensure that the next
> > file_update_time() call will get a time different from the inode's
> > current mtime.
> 
> Would it help if we only did this for directories, for now?
> 
> Files have close-to-open.  Directories... don't.  So we have the problem
> where directory changes (i.e. file creation and deletion) take a long time
> (sometimes an infinitely long time) to propagate to clients.  Plus:
> directories don't change very often, so using fine-grained time stamps only
> on directories wouldn't impact heavy I/O workloads.

I don't quite see how close-to-open really affects this issue - it still
relies on the timestamps and so can cache old data if a file update didn't
change the timestamp.
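
To illustrate the stale-cache case, here is a rough userspace sketch (not
kernel or NFS client code).  TICK_NS, coarse_stamp() and cache_looks_valid()
are hypothetical names made up for this example, and HZ=250 is only an
assumption.  Two writes that land in the same tick get the same mtime, so a
client that revalidates by comparing mtimes keeps its old data:

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define TICK_NS 4000000L	/* assumed 1/HZ, i.e. HZ=250 */

/* Hypothetical coarse clock: truncate a clock reading to the tick. */
static struct timespec coarse_stamp(struct timespec now)
{
	assert(now.tv_nsec >= 0);
	now.tv_nsec -= now.tv_nsec % TICK_NS;
	return now;
}

/* Hypothetical client check: cached data is trusted while mtime is unchanged. */
static bool cache_looks_valid(struct timespec cached, struct timespec server)
{
	return cached.tv_sec == server.tv_sec &&
	       cached.tv_nsec == server.tv_nsec;
}
```

If the client fetched the file after the first write and a second write
follows within the same tick, cache_looks_valid() still returns true even
though the data has changed.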

In my mind the difference is that near-concurrent access to files usually
involves file locking which flushes caches (and if it doesn't then you have
bigger problems) while near-concurrent access to directories relies on the
natural atomicity of dir operations so no locking or flushing occurs.

So I agree that this is probably more of an issue for directories than for
files, and that implementing it just for directories would be a sensible
first step with lower expected overhead - just my reasoning seems to be a bit
different.
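
For reference, the scheme quoted above (including Bruce's skip rule) might
look roughly like the following userspace sketch.  This is not kernel code:
the seqlock is emulated with C11 atomics, every function name is hypothetical,
TICK_NS assumes HZ=250, and bumping the global to "mtime + 1ns" is my
assumption, since the proposal doesn't say what value nfsd should store (a
hi-res clock reading would presumably be used instead).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

#define TICK_NS		4000000L	/* assumed 1/HZ, i.e. HZ=250 */
#define NSEC_PER_SEC	1000000000L

/* Emulated seqlock: an odd sequence value means a write is in progress. */
static atomic_uint nfsd_seq;
static _Atomic long long nfsd_sec;	/* "current_nfsd_time" seconds */
static _Atomic long nfsd_nsec;		/* "current_nfsd_time" nanoseconds */

static int ts_cmp(struct timespec a, struct timespec b)
{
	if (a.tv_sec != b.tv_sec)
		return a.tv_sec < b.tv_sec ? -1 : 1;
	if (a.tv_nsec != b.tv_nsec)
		return a.tv_nsec < b.tv_nsec ? -1 : 1;
	return 0;
}

/* Reader side: retry until a stable, even sequence is observed. */
static struct timespec nfsd_time_read(void)
{
	struct timespec ts;
	unsigned int seq;

	do {
		seq = atomic_load(&nfsd_seq);
		ts.tv_sec = (time_t)atomic_load(&nfsd_sec);
		ts.tv_nsec = atomic_load(&nfsd_nsec);
	} while ((seq & 1) || seq != atomic_load(&nfsd_seq));
	return ts;
}

/* Writer side: advance the global time to ts, never moving it backwards. */
static void nfsd_time_advance(struct timespec ts)
{
	unsigned int seq = atomic_load(&nfsd_seq);
	struct timespec cur;

	assert(ts.tv_nsec >= 0 && ts.tv_nsec < NSEC_PER_SEC);
	for (;;) {	/* take the write side: even -> odd */
		if (seq & 1) {
			seq = atomic_load(&nfsd_seq);
			continue;
		}
		if (atomic_compare_exchange_weak(&nfsd_seq, &seq, seq + 1))
			break;
	}
	cur.tv_sec = (time_t)atomic_load(&nfsd_sec);
	cur.tv_nsec = atomic_load(&nfsd_nsec);
	if (ts_cmp(ts, cur) > 0) {
		atomic_store(&nfsd_sec, (long long)ts.tv_sec);
		atomic_store(&nfsd_nsec, ts.tv_nsec);
	}
	atomic_store(&nfsd_seq, seq + 2);	/* release: odd -> even */
}

/* file_update_time side: use the nfsd time if it is newer than fs_now. */
static struct timespec sketch_file_update_time(struct timespec fs_now)
{
	struct timespec nfsd_now = nfsd_time_read();

	return ts_cmp(nfsd_now, fs_now) > 0 ? nfsd_now : fs_now;
}

/* nfsd side: called with an mtime that nfsd has just read from an inode. */
static void sketch_nfsd_saw_mtime(struct timespec mtime, struct timespec fs_now)
{
	bool same_tick = mtime.tv_sec == fs_now.tv_sec &&
			 mtime.tv_nsec / TICK_NS == fs_now.tv_nsec / TICK_NS;

	if (!same_tick)
		return;		/* mtime does not match the current tick */
	if (ts_cmp(nfsd_time_read(), mtime) > 0)
		return;		/* skip rule: global is already newer */
	if (++mtime.tv_nsec >= NSEC_PER_SEC) {	/* assumed bump: mtime + 1ns */
		mtime.tv_sec++;
		mtime.tv_nsec -= NSEC_PER_SEC;
	}
	nfsd_time_advance(mtime);
}
```

Only the nfsd path takes the write side here; ordinary mtime updates just do
the read loop, which is the point of keeping the atomic cost in nfsd.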

Thanks,
NeilBrown