Re: [PATCH v2 00/28] Fix up soft mounts for NFSv4.x

Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> · Wed, 3 Apr 2019 21:13:37 +0000

On Wed, 2019-04-03 at 22:51 +0200, Mkrtchyan, Tigran wrote:
> Hi Trond,
> 
> ----- Original Message -----
> > From: "Trond Myklebust" <trondmy@xxxxxxxxx>
> > To: "Olga Kornievskaia" <aglo@xxxxxxxxx>
> > Cc: "linux-nfs" <linux-nfs@xxxxxxxxxxxxxxx>
> > Sent: Tuesday, April 2, 2019 8:28:38 PM
> > Subject: Re: [PATCH v2 00/28] Fix up soft mounts for NFSv4.x
> > On Mon, 2019-04-01 at 12:54 -0400, Olga Kornievskaia wrote:
> > > On Fri, Mar 29, 2019 at 6:02 PM Trond Myklebust <trondmy@xxxxxxxxx> wrote:
> > > > This patchset aims to make soft mounts a viable option for NFSv4
> > > > clients by minimising the risk of false positive timeouts, while
> > > > allowing for faster failover of reads and writes once a timeout is
> > > > actually observed.
> > > > 
> > > > The patches rely on the NFS server correctly implementing the
> > > > contract specified in RFC7530 section 3.1.1 with respect to not
> > > > dropping requests while the transport connection is up. When this
> > > > is the case, the client can safely assume that if the request has
> > > > not received a reply after transmitting an RPC request, it is not
> > > > because the request was dropped, but rather is due to congestion,
> > > > or slow processing on the server.
> > > > IOW: as long as the connection remains up, there is no need for
> > > > requests to time out.
> > > > 
> > > > The patches break down roughly as follows:
> > > > - A set of patches to clean up the RPC engine timeouts, and ensure
> > > >   they are accurate.
> > > > - A set of patches to change the 'soft' mount semantics for
> > > >   NFSv4.x.
> > > > - A set of patches to add a new 'softerr' mount option that works
> > > >   like soft, but explicitly signals timeouts using the ETIMEDOUT
> > > >   error code rather than using EIO. This allows applications to
> > > >   tune their behaviour (e.g. by failing over to a different
> > > >   server) if a timeout occurs.
> > > 
> > > I'm just curious: why would an application be aware of a different
> > > server to connect to when the NFS layer would not be? I'm also
> > > curious whether it would break applications that typically expect
> > > to get an EIO error. Do all system calls allow returning an
> > > ETIMEDOUT error?
> > 
> > This is why it is a separate mount option. ...and actually most
> > applications blow up when they get EIO as well. However you can
> > imagine an application that might decide to retry if it hits an
> > ETIMEDOUT, while failing if it hits an EIO.
> 
> What is the reason for introducing a new error code for IO operations,
> which is not in the list of POSIX specified values for read(2) and
> write(2)? Is there an expected change in application behavior compared
> to EAGAIN?

The point is to allow aware applications to better handle a situation
which is not covered by POSIX because POSIX has no concept of storage
that is temporarily unavailable.

...and it is being proposed as an opt-in feature, precisely so that
existing applications don't need to change.
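
As a rough illustration of what an "aware" application might do on a
mount that reports ETIMEDOUT, here is a minimal sketch. It is not part
of the patchset: the function names and the idea of a list of replica
paths are hypothetical, and the only assumption taken from the
discussion above is that a timed-out read fails with ETIMEDOUT while
other I/O failures still surface as EIO.

```python
import errno
import os


def _read_file(path, size=4096):
    """Read up to `size` bytes from `path` using plain open/read/close."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, size)
    finally:
        os.close(fd)


def read_with_failover(paths, read_file=_read_file):
    """Try each replica path in order (hypothetical helper).

    ETIMEDOUT is treated as "storage temporarily unavailable", so the
    next replica is tried. Any other error, including EIO, is re-raised
    immediately, as an unaware application would experience it.
    """
    last_timeout = None
    for path in paths:
        try:
            return read_file(path)
        except OSError as exc:
            if exc.errno != errno.ETIMEDOUT:
                raise  # EIO and everything else stay fatal
            last_timeout = exc
    if last_timeout is None:
        raise ValueError("no replica paths given")
    raise last_timeout  # every replica timed out
```

An application that never checks for ETIMEDOUT simply sees a failed
read either way, which is why the behaviour is opt-in.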

> I would like to use the opportunity to bring up the topic of the
> O_NONBLOCK open(2) flag for offline files.

-- 
Trond Myklebust
CTO, Hammerspace Inc
4300 El Camino Real, Suite 105
Los Altos, CA 94022
www.hammer.space