Re: NFS client pNFS handling of NFS4ERR_NOSPC

Rick Macklem wrote:
> Trond Myklebust wrote:
> > On Mon, 2021-11-08 at 02:27 +0000, Rick Macklem wrote:
> > > Trond Myklebust wrote:
> > > > On Sun, 2021-11-07 at 00:03 +0000, Rick Macklem wrote:
> > > > > Hi,
> > > > >
> > > > > I ran a simple test using a Linux 5.12 client NFSv4.2 mount
> > > > > against a FreeBSD pNFS server, where the DS is out of space
> > > > > (intentionally, by creating a large file on it).
> > > > >
> > > > > I tried to write a file on the Linux NFS client mount and the
> > > > > mount point gets "stuck" (will not <ctrl>C nor "umount -f").
> > > > > --> The client is attempting writes against the DS repeatedly,
> > > > >        with the DS replying NFS4ERR_NOSPC. (Same byte offsets,
> > > > >        over and over and over again.)
> > > > > --> The client is repeatedly sending RPCs with LayoutError in
> > > > >        them to the MDS, reporting the NFS4ERR_NOSPC.
> > > > >
> > > > > I'll leave it up to others, but failing the program trying to
> > > > > write the file with ENOSPC would seem preferable to the
> > > > > "stuck" mount?
> > > > > --> Removing the large file from the DS so that the Writes
> > > > >       can succeed does cause the client to recover.
> > > > >
> > > >
> > > > The client expectation is that the MDS will either remedy the
> > > > situation, or it will return an appropriate application-level error
> > > > to the LAYOUTGET.
> > > Thanks Trond, that worked fine for NFSv4.2. I tweaked the pNFS server
> > > to reply NFS4ERR_NOSPC to LayoutGet and that worked fine.
> > > (This is triggered by the LayoutError.)
> > >
> > > For NFSv4.1, things don't work as well, since there is no LayoutError
> > > operation. The LayoutReturn has the NFS4ERR_NOSPC error in it,
> > > but that doesn't happen until it finishes (which doesn't happen until
> > > I free up space on the DS).
> >
> > Hmm... The ENOSPC error from the DS should in principle be marking the
> > layout for return. You're saying that the return isn't happening?
> Not until the end, after I have deleted the large file, so there is space on the
> DS for the writes. It is in the same compound as Close.
> The packet capture is here, in case you are interested:
> https://people.freebsd.org/~rmacklem/linux-ds-out-of-space.pcap
> (Taken at the MDS, so it doesn't show the DS RPCs, but they're just
>  a lot of writes that fail with NFS4ERR_NOSPC until near the end.)
> 
> If you look, you'll see it gets a layout for the entire file first,
> then it repeatedly does LayoutGets that are a little weird.
> - For 4K only, but always for an offset that is an exact multiple
>    of 1 Mbyte.
> --> Then, once I free up space on the DS, it does the compound
>       that includes both Close and LayoutReturn (which has the
>       NFS4ERR_NOSPC error report in it).
>
> > Does a newer client fix the issue?
> This was 5.12. I'll build/test a newer kernel in the next couple of
> days and report back (it's an old single core i386, so it takes a while;-).
5.15.1 exhibits the same behaviour. The only difference is that LayoutReturn
was in a separate RPC from Close, but still didn't happen until the
end, after I freed up space on the DS and the writes to the DS
succeeded. (This time I had delegations enabled, which might be why
the LayoutReturn wasn't in the same compound RPC as Close?)

rick

> rick
>
> > > But I can live with only 4.2 working well. I can't be bothered
> > > endlessly probing the DSs to see if they are out of space.
>
> > Agreed. Your server should be able to rely on the layout error reports
> > from the client (either in LAYOUTERROR or in the LAYOUTRETURN) in order
> > to figure out when the DS might be out of space.
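
For readers following the thread, the server-side behaviour described above
(note a NFS4ERR_NOSPC error report from the client, then answer later
LAYOUTGETs with NFS4ERR_NOSPC) might be sketched roughly as below. This is
only an illustration and not the FreeBSD pNFS server code: the MdsState
class, its method names and the ds_id key are all hypothetical, and how the
server learns that space has been freed is left open, since nothing in this
thread specifies it.

```python
from dataclasses import dataclass, field

# Status values as defined by the NFSv4 protocol.
NFS4_OK = 0
NFS4ERR_NOSPC = 28


@dataclass
class MdsState:
    """Hypothetical MDS bookkeeping; an assumption, not real server code."""
    full_ds: set = field(default_factory=set)

    def record_layout_error(self, ds_id: str, status: int) -> None:
        # Called for an error report carried in LAYOUTERROR (NFSv4.2)
        # or in LAYOUTRETURN. Only out-of-space reports are tracked here.
        if status == NFS4ERR_NOSPC:
            self.full_ds.add(ds_id)

    def mark_space_available(self, ds_id: str) -> None:
        # Hypothetical hook: whatever tells the MDS that the DS has
        # room again would call this.
        self.full_ds.discard(ds_id)

    def layoutget(self, ds_id: str) -> int:
        # Refuse new layouts for a DS known to be full, so the client
        # can fail the application write with ENOSPC.
        return NFS4ERR_NOSPC if ds_id in self.full_ds else NFS4_OK
```

With this sketch, a LAYOUTGET for a DS succeeds until the client reports
NFS4ERR_NOSPC for it, and succeeds again once mark_space_available() has
been called for that DS.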

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx





