Re: NFS client pNFS handling of NFS4ERR_NOSPC

Trond Myklebust wrote:
> On Mon, 2021-11-08 at 02:27 +0000, Rick Macklem wrote:
> > Trond Myklebust wrote:
> > > On Sun, 2021-11-07 at 00:03 +0000, Rick Macklem wrote:
> > > > Hi,
> > > >
> > > > I ran a simple test using a Linux 5.12 client NFSv4.2 mount
> > > > against a FreeBSD pNFS server, where the DS is out of space
> > > > (intentionally, by creating a large file on it).
> > > >
> > > > I tried to write a file on the Linux NFS client mount and the
> > > > mount point gets "stuck" (will not <ctrl>C nor "umount -f").
> > > > --> The client is attempting writes against the DS repeatedly,
> > > >        with the DS replying NFS4ERR_NOSPC. (Same byte offsets,
> > > >        over and over and over again.)
> > > > --> The client is repeatedly sending RPCs with LayoutError in
> > > >        them to the MDS, reporting the NFS4ERR_NOSPC.
> > > >
> > > > I'll leave it up to others, but failing the program trying to
> > > > write the file with ENOSPC would seem preferable to the
> > > > "stuck" mount?
> > > > --> Removing the large file from the DS so that the Writes
> > > >       can succeed does cause the client to recover.
> > > >
> > >
> > > The client expectation is that the MDS will either remedy the
> > > situation, or it will return an appropriate application-level error
> > > to the LAYOUTGET.
> > Thanks Trond, that worked fine for NFSv4.2. I tweaked the pNFS server
> > to reply NFS4ERR_NOSPC to LayoutGet and that worked fine.
> > (This is triggered by the LayoutError.)
> >
> > For NFSv4.1, things don't work as well, since there is no LayoutError
> > operation. The LayoutReturn has the NFS4ERR_NOSPC error in it,
> > but that doesn't happen until it finishes (which doesn't happen until
> > I free up space on the DS).
>
> Hmm... The ENOSPC error from the DS should in principle be marking the
> layout for return. You're saying that the return isn't happening?
Not until the end, after I have deleted the large file, so there is space on the
DS for the writes. It is in the same compound as Close.
The packet capture is here, in case you are interested:
https://people.freebsd.org/~rmacklem/linux-ds-out-of-space.pcap
(Taken at the MDS, so it doesn't show the DS RPCs, but they're just
 a lot of writes that fail with NFS4ERR_NOSPC until near the end.)

If you look, you'll see it gets a layout for the entire file first,
then it repeatedly does LayoutGets that are a little weird.
- For 4K only, but always for an offset that is an exact multiple
   of 1Mbyte.
--> Then, once I free up space on the DS, it does the compound
      that includes both Close and LayoutReturn (which has the
      NFS4ERR_NOSPC error report in it).
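
For anyone going through the capture, the LayoutGet pattern described above (4K length, offset on an exact 1Mbyte boundary) can be checked with something like the following. This is only a sketch: the function name and constants are made up for illustration, and pulling the offset/length pairs out of the pcap is left to whatever tool you prefer.

```python
MIB = 1024 * 1024   # 1Mbyte, taken as 2**20 bytes (an assumption)
RETRY_LEN = 4096    # the 4K length described above


def matches_retry_pattern(offset, length):
    """Return True if a LAYOUTGET range is 4K long and starts on a 1 MiB boundary.

    Hypothetical helper, not part of any NFS client or server code.
    """
    return length == RETRY_LEN and offset % MIB == 0
```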

> Does a newer client fix the issue?
This was 5.12. I'll build/test a newer kernel in the next couple of
days and report back (it's an old single core i386, so it takes a while;-).

rick

> But I can live with only 4.2 working well. I can't be bothered
> endlessly probing the DSs to see if they are out of space.

Agreed. Your server should be able to rely on the layout error reports
from the client (either in LAYOUTERROR or in the LAYOUTRETURN) in order
to figure out when the DS might be out of space.
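
A rough sketch of what that could look like on the MDS side, in Python purely for illustration (a real server would do this in C inside its layout code). The class, its method names, and the re-check timeout are all assumptions of mine, not anything taken from the FreeBSD or Linux sources. Only the NFS4ERR_NOSPC value (28) comes from the protocol.

```python
import time

NFS4_OK = 0
NFS4ERR_NOSPC = 28  # protocol-defined value, same number as ENOSPC


class DsSpaceTracker:
    """Hypothetical MDS-side record of which DSs clients have reported as full."""

    def __init__(self, recheck_secs=30.0, clock=time.monotonic):
        # recheck_secs is an assumed policy knob: how long a client report
        # is trusted before LAYOUTGET is allowed to succeed again.
        self._recheck_secs = recheck_secs
        self._clock = clock
        self._full_since = {}  # ds_id -> time of the most recent NOSPC report

    def note_client_error(self, ds_id, status):
        """Record an error carried in a LAYOUTERROR or LAYOUTRETURN."""
        if status == NFS4ERR_NOSPC:
            self._full_since[ds_id] = self._clock()

    def note_space_freed(self, ds_id):
        """Forget a report, e.g. after an admin frees space on the DS."""
        self._full_since.pop(ds_id, None)

    def layoutget_status(self, ds_id):
        """Status to return for a LAYOUTGET whose layout would use ds_id."""
        reported = self._full_since.get(ds_id)
        if reported is None:
            return NFS4_OK
        if self._clock() - reported >= self._recheck_secs:
            # The report is stale: drop it and let the client try the DS again.
            del self._full_since[ds_id]
            return NFS4_OK
        return NFS4ERR_NOSPC
```

The point of the timeout is that the server never has to probe the DS itself: if the DS is still full, the next client write fails and a fresh error report arrives.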

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx





