> From: Trond Myklebust <trondmy@xxxxxxxxxxxxxxx>
> Sent: Tuesday, March 14, 2023 2:40 PM
>
> > On Mar 13, 2023, at 11:17, Andrew Klaassen
> > <andrew.klaassen@xxxxxxxxxxxxxx> wrote:
> >
> > We are using applications which hang if any NFS servers fail to
> > respond. We would like to be able to control NFS timeouts so that we
> > can control the maximum time that the applications hang. We currently
> > can't do that with TCP NFS mounts, since RPC calls made to an existing
> > NFS mount are first subject to the default untuneable Sun RPC timeout
> > of 2 minutes.
> >
> > (I'll note that the existing NFS manpage seems to not describe current
> > behaviour correctly, since it says that this two-minute timeout
> > applies to initial mount operations (which it does not), and does not
> > say that the two-minute timeout applies to operations on existing
> > mounts (which it does).)
> >
> > An existing thread discussing this patch can be found here:
> >
> > Link:
> > https://lore.kernel.org/linux-nfs/45e2e7f05a13abab777b3b0868744cdbfc623f2d.camel@xxxxxxxxxx/T/
> >
> > This patch uses the RPC call timeout to set the xprt timeout. In that
> > discussion thread, Jeff Layton has pointed out that this may or may
> > not be the ideal approach. I have suggested these alternatives, and
> > would be happy to get feedback:
> >
> > - Create system-wide tuneables for xs_[local|udp|tcp]_default_timeout.
> > In our case that's less-than-ideal, since we want to change the total
> > timeout for an NFS mount on a per-server or per-mount basis rather
> > than a system-wide basis, but it would do in a pinch.
> >
> > - Add a second set of timeout options to NFS so that RPC call and xprt
> > timeouts can be specified separately. I'm guessing no-one is
> > enthusiastic about option bloat, even if this would be the
> > theoretically cleanest option. I'm guessing this would also involve
> > changing the Sun RPC API and everything that calls it in order for it
> > to accept the second set of timeout options.
> >
> > - Use timeo and retrans for the RPC call timeout, and retry for the
> > xprt timeout. Or do the opposite. The NFS manpage describes the
> > current behaviour incorrectly, so this at least wouldn't make the
> > documentation any worse. I assume this would also involve changing the
> > Sun RPC API.
> >
> > Use rpc_create_args->timeout to initialize rpc_xprt->timeout
> >
>
> Just because something can be done in the kernel, it doesn’t mean that it
> should be done in the kernel. If you’re unhappy with sunrpc timeouts, then it
> should be quite possible to do those calls in userspace, and pass the port
> number down as part of the mount syscall.

Thanks for the direction, Trond. I'll spend some time getting familiar
with the code and see if I can make that happen.

I'm currently clueless about how to get started, as there doesn't appear
to be any way to override sunrpc timeout defaults for any sunrpc call,
so I may have some followup questions once I get my head wrapped around
the mount code.

Andrew