> From: Trond Myklebust <trondmy@xxxxxxxxxxxxxxx>
> Sent: Tuesday, March 14, 2023 2:40 PM
>
> On Mar 13, 2023, at 11:17, Andrew Klaassen <andrew.klaassen@xxxxxxxxxxxxxx> wrote:
> >
> > We are using applications which hang if any NFS servers fail to
> > respond. We would like to be able to control NFS timeouts so that we
> > can control the maximum time that the applications hang. We currently
> > can't do that with TCP NFS mounts, since RPC calls made to an existing
> > NFS mount are first subject to the default untuneable Sun RPC timeout
> > of 2 minutes.
> >
> > (I'll note that the existing NFS manpage seems to not describe current
> > behaviour correctly, since it says that this two-minute timeout
> > applies to initial mount operations (which it does not), and does not
> > say that the two-minute timeout applies to operations on existing
> > mounts (which it does).)
> >
> > An existing thread discussing this patch can be found here:
> >
> > Link: https://lore.kernel.org/linux-nfs/45e2e7f05a13abab777b3b0868744cdbfc623f2d.camel@xxxxxxxxxx/T/
> >
> > This patch uses the RPC call timeout to set the xprt timeout. In that
> > discussion thread, Jeff Layton has pointed out that this may or may
> > not be the ideal approach. I have suggested these alternatives, and
> > would be happy to get feedback:
> >
> > - Create system-wide tuneables for xs_[local|udp|tcp]_default_timeout.
> >   In our case that's less-than-ideal, since we want to change the total
> >   timeout for an NFS mount on a per-server or per-mount basis rather
> >   than a system-wide basis, but it would do in a pinch.
> >
> > - Add a second set of timeout options to NFS so that RPC call and xprt
> >   timeouts can be specified separately. I'm guessing no-one is
> >   enthusiastic about option bloat, even if this would be the
> >   theoretically cleanest option. I'm guessing this would also involve
> >   changing the Sun RPC API and everything that calls it in order for it
> >   to accept the second set of timeout options.
> >
> > - Use timeo and retrans for the RPC call timeout, and retry for the
> >   xprt timeout. Or do the opposite. The NFS manpage describes the
> >   current behaviour incorrectly, so this at least wouldn't make the
> >   documentation any worse. I assume this would also involve changing the
> >   Sun RPC API.
> >
> > Use rpc_create_args->timeout to initialize rpc_xprt->timeout
>
> Just because something can be done in the kernel, it doesn’t mean that it
> should be done in the kernel. If you’re unhappy with sunrpc timeouts, then it
> should be quite possible to do those calls in userspace, and pass the port
> number down as part of the mount syscall.

An update, since I got an email from someone inquiring about this
problem and my patch:

I worked on this for a little while to try to figure out how to do what
you suggested, but I didn't find any way to pass an existing port number
to the mount syscall in a way that would preserve the timeout options on
the userspace sunrpc call.

If you have a quick proof-of-concept I could try to take it from there.
Unfortunately my company has moved me on to other projects and I'm not
able to dedicate much time to it.

Andrew