On Fri, 2017-11-10 at 17:45 -0500, Chuck Lever wrote: > > On Nov 10, 2017, at 5:37 PM, Joshua Watt <jpewhacker@xxxxxxxxx> > > wrote: > > > > Specifying the serverfailed mount option causes all subsequent RPC > > tasks > > that are queued to fail with -EIO instead of timing out. For > > example, if > > a server has disappeared, the sequence: > > > > mount -o remount,serverfailed > > umount -f > > > > will ensure that all pending I/O requests are cancelled, and any > > new > > requests will also fail. In the event that the server returns, the > > flag > > can be removed with a remount: > > > > mount -o remount,noserverfailed > > > > Although bringing the server back may result in a loss of data > > Hi Joshua, interesting work! > > A couple of things I'm wondering: > > 1. Does this also change the serverfailed setting on submounts > (ie mounts that the kernel did when traversing an NFSv4 referral or > when going from the server's pseudofs into a real fs). These need > to be unmounted first before the parent mount can be unmounted, > and in the latter case they would all be backed by the same stuck > NFS server. I don't honestly know, but I will find out. Our use case doens't deal with either of those cases much. > > 2. If there is a hanging sync(2), does the umount -f unstick it? > That seems relevant for Neil's "make shutdown reliable" use case. > I would like a stuck NFS mount not to result in local file system > corruption, if possible, during a hard shutdown. > Several previous posts have aluded to calling "umount -f" repeatedly to get a "stuck" NFS mount to actually unmount. This patch set is effectively a way of automating that process. More formally, the sequence of commands: mount PATH -o remount,serverfailed umount -f PATH Is closely approximated by: while(1) umount2(PATH, MNT_FORCE); from userspace. In retrospect, I think that the "umount -f" shouldn't be required after remount: If the serverfailed mount option is specified it should also kill all pending RPCs. A umount -f is undesirable, because you may not actually want to (potentially) unmount the file system just to cancel the RPCs. More directly to your question: I don't honestly know if this can "unstick" a hanging sync(2). If repeatedly calling umount -f will unstick it, then these patches will exhibit the same behavior. I will continue to do some research on this, but if anyone knows the answer they might be able to chime in. Could you clarify on what you mean by a stuck NFS mount not resulting in local filesystem corruption? I'm not sure I undertsand well enough to follow that comment. Thanks for the feedback > > > Signed-off-by: Joshua Watt <JPEWhacker@xxxxxxxxx> > > --- > > fs/nfs/inode.c | 6 ++++++ > > fs/nfs/super.c | 30 +++++++++++++++++++++++++----- > > include/uapi/linux/nfs_mount.h | 1 + > > 3 files changed, 32 insertions(+), 5 deletions(-) > > > > diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c > > index 134d9f560240..55b7cd40508d 100644 > > --- a/fs/nfs/inode.c > > +++ b/fs/nfs/inode.c > > @@ -723,6 +723,12 @@ static void > > nfs_readdirplus_parent_cache_hit(struct dentry *dentry) > > > > static bool nfs_need_revalidate_inode(struct inode *inode) > > { > > + /* If the server has failed, it is not going to respond, > > so don't try > > + * and revalidated (otherwise, the serverfailed flag can't > > be cleared by > > + * a remount) > > + */ > > + if (NFS_SERVER(inode)->flags & NFS_MOUNT_SERVERFAILED) > > + return false; > > if (NFS_I(inode)->cache_validity & > > (NFS_INO_INVALID_ATTR|NFS_INO_INVALID_LABEL)) > > return true; > > diff --git a/fs/nfs/super.c b/fs/nfs/super.c > > index 777a0cc22704..bca38e1cdd85 100644 > > --- a/fs/nfs/super.c > > +++ b/fs/nfs/super.c > > @@ -90,6 +90,7 @@ enum { > > Opt_resvport, Opt_noresvport, > > Opt_fscache, Opt_nofscache, > > Opt_migration, Opt_nomigration, > > + Opt_serverfailed, Opt_noserverfailed, > > > > /* Mount options that take integer arguments */ > > Opt_port, > > @@ -151,6 +152,8 @@ static const match_table_t > > nfs_mount_option_tokens = { > > { Opt_nofscache, "nofsc" }, > > { Opt_migration, "migration" }, > > { Opt_nomigration, "nomigration" }, > > + { Opt_serverfailed, "serverfailed" }, > > + { Opt_noserverfailed, "noserverfailed" }, > > > > { Opt_port, "port=%s" }, > > { Opt_rsize, "rsize=%s" }, > > @@ -637,6 +640,7 @@ static void nfs_show_mount_options(struct > > seq_file *m, struct nfs_server *nfss, > > { NFS_MOUNT_NORDIRPLUS, ",nordirplus", "" }, > > { NFS_MOUNT_UNSHARED, ",nosharecache", "" }, > > { NFS_MOUNT_NORESVPORT, ",noresvport", "" }, > > + { NFS_MOUNT_SERVERFAILED, ",serverfailed", "" }, > > { 0, NULL, NULL } > > }; > > const struct proc_nfs_info *nfs_infop; > > @@ -1330,6 +1334,11 @@ static int nfs_parse_mount_options(char > > *raw, > > case Opt_nomigration: > > mnt->options &= ~NFS_OPTION_MIGRATION; > > break; > > + case Opt_serverfailed: > > + case Opt_noserverfailed: > > + set_flag(mnt, NFS_MOUNT_SERVERFAILED, > > + token == Opt_serverfailed); > > + break; > > > > /* > > * options that take numeric values > > @@ -2192,6 +2201,8 @@ static int nfs_validate_text_mount_data(void > > *options, > > return -EINVAL; > > } > > > > +#define NFS_REMOUNT_CHANGE_FLAGS (NFS_MOUNT_SERVERFAILED) > > + > > #define NFS_REMOUNT_CMP_FLAGMASK ~(NFS_MOUNT_INTR \ > > | NFS_MOUNT_SECURE \ > > | NFS_MOUNT_TCP \ > > @@ -2200,7 +2211,8 @@ static int nfs_validate_text_mount_data(void > > *options, > > | NFS_MOUNT_NONLM \ > > | NFS_MOUNT_BROKEN_SUID \ > > | NFS_MOUNT_STRICTLOCK \ > > - | NFS_MOUNT_LEGACY_INTERFACE) > > + | NFS_MOUNT_LEGACY_INTERFACE \ > > + | NFS_REMOUNT_CHANGE_FLAGS) > > > > #define NFS_MOUNT_CMP_FLAGMASK (NFS_REMOUNT_CMP_FLAGMASK & \ > > ~(NFS_MOUNT_UNSHARED | NFS_MOUNT_NORESVPORT)) > > @@ -2209,12 +2221,15 @@ static int > > nfs_compare_and_set_remount_data(struct nfs_server *nfss, > > struct nfs_parsed_mount_data *data) > > { > > + int changed_flags_mask = data->flags_mask & > > NFS_REMOUNT_CHANGE_FLAGS; > > + struct rpc_clnt *cl = nfss->client; > > + > > if ((data->flags ^ nfss->flags) & NFS_REMOUNT_CMP_FLAGMASK || > > data->rsize != nfss->rsize || > > data->wsize != nfss->wsize || > > data->version != nfss->nfs_client->rpc_ops->version || > > data->minorversion != nfss->nfs_client->cl_minorversion || > > - !nfs_auth_info_match(&data->auth_info, nfss->client- > > >cl_auth->au_flavor) || > > + !nfs_auth_info_match(&data->auth_info, cl->cl_auth- > > >au_flavor) || > > data->acregmin != nfss->acregmin / HZ || > > data->acregmax != nfss->acregmax / HZ || > > data->acdirmin != nfss->acdirmin / HZ || > > @@ -2225,13 +2240,16 @@ nfs_compare_and_set_remount_data(struct > > nfs_server *nfss, > > (struct sockaddr *)&nfss->nfs_client- > > >cl_addr)) > > return -EINVAL; > > > > - if (data->retrans != nfss->client->cl_timeout->to_retries > > || > > - data->timeo != (10U * nfss->client->cl_timeout- > > >to_initval / HZ)) { > > + /* Update flags */ > > + nfss->flags = (nfss->flags & ~changed_flags_mask) | > > + (data->flags & changed_flags_mask); > > + > > + if (data->retrans != cl->cl_timeout->to_retries || > > + data->timeo != (10U * cl->cl_timeout->to_initval / > > HZ)) { > > /* Note that this will affect all mounts from the same > > server, > > * that use the same protocol. The timeouts are always > > forced > > * to be the same. > > */ > > - struct rpc_clnt *cl = nfss->client; > > if (cl->cl_timeout != &cl->cl_timeout_default) > > memcpy(&cl->cl_timeout_default, cl->cl_timeout, > > sizeof(struct rpc_timeout)); > > @@ -2239,6 +2257,8 @@ nfs_compare_and_set_remount_data(struct > > nfs_server *nfss, > > cl->cl_timeout_default.to_initval = data->timeo * HZ / > > 10U; > > } > > > > + cl->cl_kill_new_tasks = !!(nfss->flags & > > NFS_MOUNT_SERVERFAILED); > > + > > return 0; > > } > > > > diff --git a/include/uapi/linux/nfs_mount.h > > b/include/uapi/linux/nfs_mount.h > > index e44e00616ab5..8ad931cdca20 100644 > > --- a/include/uapi/linux/nfs_mount.h > > +++ b/include/uapi/linux/nfs_mount.h > > @@ -74,5 +74,6 @@ struct nfs_mount_data { > > > > #define NFS_MOUNT_LOCAL_FLOCK 0x100000 > > #define NFS_MOUNT_LOCAL_FCNTL 0x200000 > > +#define NFS_MOUNT_SERVERFAILED 0x400000 > > > > #endif > > -- > > 2.13.6 > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux- > > nfs" in > > the body of a message to majordomo@xxxxxxxxxxxxxxx > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- > Chuck Lever > > > -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html