On Wed, 18 Aug 2010 14:15:51 -0400
Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:

> On Aug 18, 2010, at 1:32 PM, J. Bruce Fields wrote:
>
> > On Wed, Aug 18, 2010 at 03:53:59PM +1000, Neil Brown wrote:
> >> I'm not sure you even want to pay for a per-filesystem atomic access when
> >> updating mtime. mnt_want_write - called at the same time - seems to go to
> >> some lengths to avoid an atomic operation.
> >>
> >> I think that nfsd should be the only place that has to pay the atomic
> >> penalty, as it is where the need is.
> >>
> >> I imagine something like this:
> >>  - Create a global struct timespec which is protected by a seqlock
> >>    Call it current_nfsd_time or similar.
> >>  - file_update_time reads this and uses it if it is newer than
> >>    current_fs_time.
> >>  - nfsd updates it whenever it reads an mtime out of an inode that matches
> >>    current_fs_time to the granularity of 1/HZ.
> >
> > We can also skip the update whenever current_nfsd_time is greater than
> > the inode's mtime--that's enough to ensure that the next
> > file_update_time() call will get a time different from the inode's
> > current mtime.
>
> Would it help if we only did this for directories, for now?
>
> Files have close-to-open. Directories... don't. So we have the problem
> where directory changes (ie file creation and deletion) takes a long time
> (some times an infinitely long time) to propagate to clients. Plus:
> directories don't change very often, so using fine-grained time stamps
> only on directories wouldn't impact heavy I/O workloads.

I don't quite see how close-to-open really affects this issue - it still
relies on the timestamps and so can cache old data if a file update didn't
change the timestamp.

In my mind the difference is that near-concurrent access to files usually
involves file locking, which flushes caches (and if it doesn't, then you have
bigger problems), while near-concurrent access to directories relies on the
natural atomicity of directory operations, so no locking or flushing occurs.

So I agree that this is probably more of an issue for directories than for
files, and that implementing it just for directories would be a sensible
first step with lower expected overhead. My reasoning just seems to be a bit
different.

Thanks,
NeilBrown
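
For reference, the scheme quoted above (a global current_nfsd_time that nfsd bumps and file_update_time consults, plus the suggested skip) might look roughly like the userspace model below. This is only a sketch under assumptions that are not in the thread: struct ts, model_file_update_time() and model_nfsd_observe_mtime() are hypothetical names, the seqlock is left out because the model is single-threaded, "matches current_fs_time to the granularity of 1/HZ" is approximated as "mtime is not older than the current coarse tick", and the bump is assumed to be mtime plus one nanosecond.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000L

/* Simplified stand-in for the kernel's struct timespec (hypothetical). */
struct ts {
    int64_t sec;
    long nsec;
};

/* Global from the proposal. In the kernel it would be protected by a
 * seqlock; this single-threaded model omits the locking. */
static struct ts current_nfsd_time = {0, 0};

/* Returns <0, 0 or >0 as a is earlier than, equal to or later than b. */
static int ts_cmp(struct ts a, struct ts b)
{
    if (a.sec != b.sec)
        return a.sec < b.sec ? -1 : 1;
    if (a.nsec != b.nsec)
        return a.nsec < b.nsec ? -1 : 1;
    return 0;
}

/* Writer side: choose the mtime for an update. coarse_now models
 * current_fs_time, which only advances once per tick. */
static struct ts model_file_update_time(struct ts coarse_now)
{
    if (ts_cmp(current_nfsd_time, coarse_now) > 0)
        return current_nfsd_time;
    return coarse_now;
}

/* nfsd side: called when nfsd reads an mtime out of an inode.
 * Returns true if current_nfsd_time was bumped. */
static bool model_nfsd_observe_mtime(struct ts mtime, struct ts coarse_now)
{
    /* mtime is from an earlier tick: the next update differs anyway. */
    if (ts_cmp(mtime, coarse_now) < 0)
        return false;

    /* Suggested skip: the global is already ahead of this mtime. */
    if (ts_cmp(current_nfsd_time, mtime) > 0)
        return false;

    /* Assumption: bump to mtime + 1ns so the next update must differ. */
    current_nfsd_time = mtime;
    current_nfsd_time.nsec += 1;
    if (current_nfsd_time.nsec >= NSEC_PER_SEC) {
        current_nfsd_time.nsec -= NSEC_PER_SEC;
        current_nfsd_time.sec += 1;
    }
    return true;
}
```

In this model only the nfsd path ever writes the global; the write path just reads and compares, which is the point of keeping the atomic cost in nfsd.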