Thanks, applied (and replaced the earlier patch in the for-2.6.27
branch at

	git://linux-nfs.org/~bfields/linux.git for-2.6.27

I was sort of curious if I could convince myself to go through this
cycle only appending to that branch.  I guess not.  Maybe next time.)

--b.

On Thu, Jun 19, 2008 at 10:11:09AM +1000, NeilBrown wrote:
> 
> OCFS2 can return -ERESTARTSYS from write requests (and possibly
> elsewhere) if there is a signal pending.
> 
> If nfsd is shutdown (by sending a signal to each thread) while there
> is still an IO load from the client, each thread could handle one last
> request with a signal pending.  This can result in -ERESTARTSYS
> which is not understood by nfserrno() and so is reflected back to
> the client as nfserr_io aka -EIO.  This is wrong.
> 
> Instead, interpret ERESTARTSYS to mean "try again later" by returning
> nfserr_jukebox.  The client will resend and - if the server is
> restarted - the write will (hopefully) be successful and everyone will
> be happy.
> 
> The symptom that I narrowed down to this was:
>    copy a large file via NFS to an OCFS2 filesystem, and restart
>    the nfs server during the copy.
>    The 'cp' might get an -EIO, and the file will be corrupted -
>    presumably holes in the middle where writes appeared to fail.
> 
> 
> Signed-off-by: Neil Brown <neilb@xxxxxxx>
> 
> ### Diffstat output
>  ./fs/nfsd/nfsproc.c |    1 +
>  1 file changed, 1 insertion(+)
> 
> diff .prev/fs/nfsd/nfsproc.c ./fs/nfsd/nfsproc.c
> --- .prev/fs/nfsd/nfsproc.c	2008-06-19 10:06:36.000000000 +1000
> +++ ./fs/nfsd/nfsproc.c	2008-06-19 10:07:58.000000000 +1000
> @@ -614,6 +614,7 @@ nfserrno (int errno)
>  #endif
>  		{ nfserr_stale, -ESTALE },
>  		{ nfserr_jukebox, -ETIMEDOUT },
> +		{ nfserr_jukebox, -ERESTARTSYS },
>  		{ nfserr_dropit, -EAGAIN },
>  		{ nfserr_dropit, -ENOMEM },
>  		{ nfserr_badname, -ESRCH },
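
For illustration only, the table lookup that the patch extends can be
sketched in userspace C roughly as follows. This is a minimal sketch
and not the kernel code: every sketch_* name is hypothetical, the
status values are plain ints (NFSv3 wire numbers) rather than the
kernel's nfserr_* constants, and SKETCH_ERESTARTSYS is defined locally
because ERESTARTSYS is kernel-internal and not exposed by userspace
<errno.h>. The fallback to the IO error for unlisted values follows
the behaviour described in the commit message above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumption: ERESTARTSYS is kernel-internal, so the sketch defines
 * its own copy of the value instead of relying on <errno.h>. */
#define SKETCH_ERESTARTSYS 512

/* Hypothetical stand-ins for nfserr_* (NFSv3 status numbers). */
enum sketch_nfserr {
	SKETCH_NFSERR_IO      = 5,
	SKETCH_NFSERR_STALE   = 70,
	SKETCH_NFSERR_JUKEBOX = 10008,
};

/* Pairs of (NFS status, negative system error), as in the diff. */
static const struct {
	enum sketch_nfserr nfserr;
	int syserr;
} sketch_errtbl[] = {
	{ SKETCH_NFSERR_STALE,   -ESTALE },
	{ SKETCH_NFSERR_JUKEBOX, -ETIMEDOUT },
	{ SKETCH_NFSERR_JUKEBOX, -SKETCH_ERESTARTSYS },	/* the new entry */
};

/* Linear search of the table; anything not listed becomes an IO error. */
static enum sketch_nfserr sketch_nfserrno(int errno_val)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_errtbl) / sizeof(sketch_errtbl[0]); i++)
		if (sketch_errtbl[i].syserr == errno_val)
			return sketch_errtbl[i].nfserr;
	return SKETCH_NFSERR_IO;
}

/* Small self-check: a pending-signal error now means "try again later". */
static void sketch_self_check(void)
{
	assert(sketch_nfserrno(-SKETCH_ERESTARTSYS) == SKETCH_NFSERR_JUKEBOX);
	assert(sketch_nfserrno(-ETIMEDOUT) == SKETCH_NFSERR_JUKEBOX);
	assert(sketch_nfserrno(-EPIPE) == SKETCH_NFSERR_IO);
}
```

Without the -SKETCH_ERESTARTSYS row, the first assertion would fail
because the lookup would fall through to the IO error, which is the
misreporting the patch is meant to avoid.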