Re: handling ERR_SERVERFAULT on RESTOREFH

On Thu, 2020-04-30 at 22:05 -0400, Tom Talpey wrote:
> On 4/29/2020 12:22 PM, Olga Kornievskaia wrote:
> > On Wed, Apr 29, 2020 at 11:46 AM J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> > > On Tue, Apr 28, 2020 at 10:12:29PM -0400, Olga Kornievskaia wrote:
> > > > I also believe that client shouldn't be coded to a broken server. But
> > > > in some of those cases, the client is not spec compliant, how is that
> > > > a server bug? The case of SERVERFAULT of RESTOREFH I'm not sure what
> > > > to make of it. I think it's more of a spec failure to address. It
> > > > seems that server isn't allowed to fail after executing a
> > > > non-idempotent operation but that's a hard requirement. I still think
> > > > that client's best set of action is to ignore errors on RESTOREFH.
> > > 
> > > Maybe.  But how is a server hitting SERVERFAULT on RESTOREFH, anyway?
> > > That's pretty weird.
> > 
> > An example error is ENOMEM. A server is doing operations to lookup the
> > filehandle (due to it being some other place) and needs to allocate
> > memory. It's possible that resources are currently unavailable. Since
> > RESTOREFH doesn't allow EDELAY, server can only return SERVERFAULT.
> 
> Why does the server need to do that? Surely it can best know how and
> when to reschedule a memory allocation, instead of whining about its
> temporary failure to the client.
> 
> > But as I mentioned before, even if EDELAY was allowed, client only
> > resends the whole compound which is incorrect in case of
> > non-idempotent operations.
> 
> Indeed, that's a protocol imperative, which the client should obey
> by "cracking" the compound to determine what to retry.


RFC5661:

15.1.1.6.  NFS4ERR_SERVERFAULT (Error Code 10006)

   An error occurred on the server that does not map to any of the
   specific legal NFSv4.1 protocol error values.  The client should
   translate this into an appropriate error.  UNIX clients may choose
   to translate this to EIO.


Which I believe is what we do. As I said, this is a server bug that
needs to be fixed on the server and I see no need to change the client
behaviour for now.
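
For illustration only, the translation the RFC text describes can be
sketched as below. This is a minimal sketch, not the Linux client's
actual code: the enum names and the helper sketch_nfs4_stat_to_errno()
are hypothetical, and sending unrecognised status values to -EIO as
well is an assumption made here for brevity. Only the value 10006 and
the EIO mapping come from the RFC text quoted above.

```c
#include <errno.h>

/* Status values as given in RFC 5661; the names are local to this sketch. */
enum {
    SKETCH_NFS4_OK = 0,
    SKETCH_NFS4ERR_SERVERFAULT = 10006,
};

/*
 * Hypothetical helper: translate an NFSv4.1 operation status into a
 * negative errno, in the style used by kernel code.
 */
static int sketch_nfs4_stat_to_errno(int status)
{
    switch (status) {
    case SKETCH_NFS4_OK:
        return 0;
    case SKETCH_NFS4ERR_SERVERFAULT:
        /* "UNIX clients may choose to translate this to EIO." */
        return -EIO;
    default:
        /* Assumption for this sketch: anything unrecognised is also EIO. */
        return -EIO;
    }
}
```

With this mapping, a SERVERFAULT returned on RESTOREFH reaches the
caller as -EIO and no operation in the compound is resent.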

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx





