When an i_op->getattr() call is made on an NFS file (typically from a 'stat' family system call), NFS will first flush any dirty data to the server. This ensures that the mtime reported is correct and stable, but has a performance penalty. 'stat' is normally thought to be a quick operation, and imposing this cost can be surprising. I have seen problems when one process is writing a large file and another process performs "ls -l" on the containing directory and is blocked for as long as it take to flush all the dirty data to the server, which can be minutes. I have also seen a legacy application which frequently calls "fstat" on a file that it is writing to. On a local filesystem (and in the Solaris implementation of NFS) this fstat call is cheap. On Linux/NFS, the causes a noticeable decrease in throughput. The only circumstances where an application calling 'stat()' might get an mtime which is not stable are times when some other process is writing to the file and the two processes are not using locking to ensure consistency, or when the one process is both writing and stating. In neither of these cases is it reasonable to expect the mtime to be stable. In the most common cases where mtime is important (e.g. make), no other process has the file open, so there will be no dirty data and the mtime will be stable. Rather than unilaterally changing this behavior of 'stat', this patch adds a "nosyncflush" mount option to allow sysadmins to have applications which are hurt by the current behavior to disable it. Note that this option should probably *not* be used together with "nocto". In that case, mtime could be unstable even when no process has the file open. Signed-off-by: NeilBrown <neilb@xxxxxxxx> --- fs/nfs/inode.c | 3 ++- fs/nfs/super.c | 10 ++++++++++ include/uapi/linux/nfs_mount.h | 6 ++++-- 3 files changed, 16 insertions(+), 3 deletions(-) diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index b992d2382ffa..16629a34dd62 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -740,7 +740,8 @@ int nfs_getattr(const struct path *path, struct kstat *stat, trace_nfs_getattr_enter(inode); /* Flush out writes to the server in order to update c/mtime. */ - if (S_ISREG(inode->i_mode)) { + if (S_ISREG(inode->i_mode) && + !(NFS_SERVER(inode)->flags & NFS_MOUNT_NOSTATFLUSH)) { err = filemap_write_and_wait(inode->i_mapping); if (err) goto out; diff --git a/fs/nfs/super.c b/fs/nfs/super.c index 29bacdc56f6a..2351c0be98f5 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -90,6 +90,7 @@ enum { Opt_resvport, Opt_noresvport, Opt_fscache, Opt_nofscache, Opt_migration, Opt_nomigration, + Opt_statflush, Opt_nostatflush, /* Mount options that take integer arguments */ Opt_port, @@ -151,6 +152,8 @@ static const match_table_t nfs_mount_option_tokens = { { Opt_nofscache, "nofsc" }, { Opt_migration, "migration" }, { Opt_nomigration, "nomigration" }, + { Opt_statflush, "statflush" }, + { Opt_nostatflush, "nostatflush" }, { Opt_port, "port=%s" }, { Opt_rsize, "rsize=%s" }, @@ -637,6 +640,7 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss, { NFS_MOUNT_NORDIRPLUS, ",nordirplus", "" }, { NFS_MOUNT_UNSHARED, ",nosharecache", "" }, { NFS_MOUNT_NORESVPORT, ",noresvport", "" }, + { NFS_MOUNT_NOSTATFLUSH, ",nostatflush", "" }, { 0, NULL, NULL } }; const struct proc_nfs_info *nfs_infop; @@ -1334,6 +1338,12 @@ static int nfs_parse_mount_options(char *raw, case Opt_nomigration: mnt->options &= ~NFS_OPTION_MIGRATION; break; + case Opt_statflush: + mnt->flags &= ~NFS_MOUNT_NOSTATFLUSH; + break; + case Opt_nostatflush: + mnt->flags |= NFS_MOUNT_NOSTATFLUSH; + break; /* * options that take numeric values diff --git a/include/uapi/linux/nfs_mount.h b/include/uapi/linux/nfs_mount.h index e44e00616ab5..d7c6f809d25d 100644 --- a/include/uapi/linux/nfs_mount.h +++ b/include/uapi/linux/nfs_mount.h @@ -72,7 +72,9 @@ struct nfs_mount_data { #define NFS_MOUNT_NORESVPORT 0x40000 #define NFS_MOUNT_LEGACY_INTERFACE 0x80000 -#define NFS_MOUNT_LOCAL_FLOCK 0x100000 -#define NFS_MOUNT_LOCAL_FCNTL 0x200000 +#define NFS_MOUNT_LOCAL_FLOCK 0x100000 +#define NFS_MOUNT_LOCAL_FCNTL 0x200000 + +#define NFS_MOUNT_NOSTATFLUSH 0x400000 #endif -- 2.14.0.rc0.dirty
Attachment:
signature.asc
Description: PGP signature