On Thu, Dec 12, 2019 at 11:47 AM Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>
> Hi Olga,
>
> On Wed, 2019-12-11 at 15:36 -0500, Olga Kornievskaia wrote:
> > Hi Trond,
> >
> > I'd like to raise this once again. Is it true that setting a timeout
> > limit (TCP_USER_TIMEOUT) is not user configurable (rather I'm pretty
> > sure it is not) but my question is why shouldn't it be tied to the
> > "timeo" mount option? Right now, only the session/lease manager thread
> > sets it via rpc_set_connect_timeout() to be lease period related.
> >
> > Is it the fact that we don't want to allow user to control TCP
> > settings via the mount options? But somehow folks are expecting to be
> > able to set low "timeo" value and have the (dead) connection to be
> > considered dead earlier than for a rather long timeout period which
> > is happening now.
>
> In my mind, the two are correlated, but are not equivalent.
>
> The 'timeo' value is basically a timeout for how long it takes for the
> whole process of "send RPC call", "have it processed by the server" and
> "receive reply".
> IOW: 'timeo' is about how long it takes for an RPC call to execute
> end-to-end.

Ok, but what happens is that no action (connection-wise) is taken when
this timeout goes off, and that's a problem for detecting bad
connections.

> The TCP_USER_TIMEOUT, is essentially a timeout for how long it takes
> the server to ACK receipt of the RPC call once we've placed it in the
> TCP socket.
> IOW: it is a timeout for the networking part of an RPC call
> transmission.

But why is the TCP timeout (1) not user configurable and/or (2) not
tied to "timeo"?

> So, as I said, the two are correlated: if the server is down, then your
> timeout is dominated by the fact that the network transmission never
> completes. However if the server is up and congested, then the
> "processing by the server" is likely to dominate.
>
> The other thing to note is that if the TCP connection is unresponsive,
> we may want to fail that much faster in order to give ourselves a
> chance to close the connection, open a new one and retransmit the
> requests from the old connection before the 'timeo' is triggered (since
> in the case of a soft timeout, that could be a fatal error).

"we may want to fail" doesn't happen, and that's exactly what I would
like to happen. Also, the TCP timeout is set to the lease time (let's
take a Linux server, which sets a 90s timeout) and that's larger than
the default "timeo", which is 60s. That goes against your intention to
recover in time.

> Does that make sense?

It's the last case I'm interested in. The issue I'm having is that
after a "timeout" (which should be a lease period), the client doesn't
send a SYN trying to establish a new connection.

Here's a current problem. In the cloud environment, a server node goes
down. It's spun up again in a different VM (but with the same IP) and
the server is ready to receive requests and continue with the IO. The
problem is that the client doesn't try to send a new SYN until the old
connection times out. This timeout is 3mins for v3 and can't be
shortened because TCP_USER_TIMEOUT isn't user configurable or tied
into the timeo. But the user expects that connections time out after
60s (the default timeo), or whatever timeo value is specified during
mount. The current Linux client doesn't do that.

Even in v4, in my testing, the client doesn't send the new SYN after
the lease period (but I believe that's a bug). The only time it does
is if I change rpc_set_connect_timeout() to something low so that the
default of 18000 is set.

(1) I could be wrong, but I think there is a bug that doesn't
re-establish the connection (unless some low value is set).
(2) I think there should be the ability (at least for v3) to set the
timeout to lower than 3mins.
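
To put numbers on the gap described above, here is a rough
back-of-the-envelope sketch. It only does the arithmetic and is not
kernel code; the function names are made up for illustration. It
relies on "timeo" being given in deciseconds (so the default of 600 is
60s) and takes the 3min (v3) and 90s lease (v4) figures quoted above as
the TCP-level timeouts.

```python
def timeo_to_seconds(timeo_deciseconds):
    """Convert an NFS 'timeo' mount value (tenths of a second) to seconds."""
    return timeo_deciseconds / 10


def teardown_lag(timeo_deciseconds, tcp_timeout_seconds):
    """Seconds between the first RPC timeout and the TCP-level timeout.

    A positive result means the RPC layer gives up waiting before the
    connection itself is declared dead.
    """
    return tcp_timeout_seconds - timeo_to_seconds(timeo_deciseconds)


DEFAULT_TIMEO = 600  # default for NFS over TCP: 600 deciseconds = 60s

# TCP-level timeouts as quoted in this thread (assumed, not measured here).
for label, tcp_timeout in (("v3, 3min", 180), ("v4, 90s lease", 90)):
    lag = teardown_lag(DEFAULT_TIMEO, tcp_timeout)
    print(f"{label}: connection considered dead {lag:.0f}s after timeo fires")
```

In both cases the lag is positive, which is the opposite of the "fail
the connection before timeo triggers" ordering described in the quoted
text.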
Perhaps we can add a new mount option: either have a totally separate
TCP timeout value, or something like "sync_nfstcp_timeouts" that uses
timeo to govern both the NFS and TCP timeouts.

> >
> > Thanks.
> >
> > On Wed, Oct 3, 2018 at 3:06 PM Olga Kornievskaia <aglo@xxxxxxxxx>
> > wrote:
> > > On Wed, Oct 3, 2018 at 2:45 PM Trond Myklebust <
> > > trondmy@xxxxxxxxxxxxxxx> wrote:
> > > > On Wed, 2018-10-03 at 14:31 -0400, Olga Kornievskaia wrote:
> > > > > Hi folks,
> > > > >
> > > > > Is it true that NFS mount option "timeo" has nothing to do with
> > > > > the socket's setting of the user-specified timeout
> > > > > TCP_USER_TIMEOUT? Instead, when creating a TCP socket NFS uses
> > > > > either default/hard coded value of 60s for v3 or for v4.x it's
> > > > > lease based. Is there no value in having an adjustable TCP
> > > > > timeout value?
> > > > >
> > > >
> > > > It is adjusted. Please see the calculation in
> > > > xs_tcp_set_socket_timeouts().
> > >
> > > but it's not user configurable, is it? I don't see a way to modify
> > > v3's default 60s TCP timeout. and also in v4, the timeouts are set
> > > from xs_tcp_set_connect_timeout() for the lease period but again
> > > not user configurable, as far as i can tell.
> > >
> > > > --
> > > > Trond Myklebust
> > > > Linux NFS client maintainer, Hammerspace
> > > > trond.myklebust@xxxxxxxxxxxxxxx
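
For what it's worth, here is a userspace sketch of what tying the two
timeouts together could look like. This is only an illustration of the
idea, not how the sunrpc code sets the option inside the kernel, and
the helper names are hypothetical. The one real interface used is the
Linux TCP_USER_TIMEOUT socket option, which takes milliseconds; "timeo"
is in deciseconds, so the conversion is a factor of 100.

```python
import socket


def tcp_user_timeout_ms(timeo_deciseconds):
    """Map an NFS 'timeo' value (deciseconds) to a TCP_USER_TIMEOUT value (ms)."""
    if timeo_deciseconds <= 0:
        raise ValueError("timeo must be positive")
    return timeo_deciseconds * 100


def apply_to_socket(sock, timeo_deciseconds):
    """Set TCP_USER_TIMEOUT on a TCP socket from a timeo value (Linux only)."""
    # The constant is only exposed by Python on platforms that support it.
    opt = getattr(socket, "TCP_USER_TIMEOUT", None)
    if opt is None:
        raise OSError("TCP_USER_TIMEOUT is not available on this platform")
    sock.setsockopt(socket.IPPROTO_TCP, opt, tcp_user_timeout_ms(timeo_deciseconds))


# With the default timeo of 600, unacknowledged data would be given up on
# after 60000 ms, i.e. at the same point the RPC timeout fires.
print(tcp_user_timeout_ms(600))
```

Whether the TCP timeout should equal timeo or be some fraction of it
(so the connection can be replaced before timeo triggers) is a separate
question that this sketch does not try to answer.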