On Fri, Oct 17, 2008 at 04:26:18PM -0400, Talpey, Thomas wrote:
> At 02:59 PM 10/17/2008, Marc Eshel wrote:
> >linux-nfs-owner@xxxxxxxxxxxxxxx wrote on 10/17/2008 10:44:54 AM:
> >
> >> "J. Bruce Fields" <bfields@xxxxxxxxxxxx>
> >> Requests longer than a page are still not deferred, so large writes that
> >> trigger upcalls still get an ERR_DELAY.  OK, probably no big deal.
> >>
> >> I don't think we can apply this until we have some way to track the
> >> number and size of deferred requests outstanding and fall back on
> >> ERR_DELAY if it's too much.
> >
> >But I thought that the problem here is that the Linux NFS client doesn't
> >handle this return code properly.
>
> Definitely this is an issue. Early clients do one of two things, they either
> pass the error back to the application, or they enter a buzz loop resending
> the operation with no delay. Later clients back off, but for a constant
> five seconds.

I haven't tested it, but from fs/nfs/nfs4proc.c:nfs4_delay() it appears
to start at a tenth of a second and then do exponential backoff (up to
15 seconds).  Looks to me like the code's been that way since at least
2.6.19.

--b.

> Either way, the server is generally better off gritting its
> teeth and completing the operation.
>
> Blocking server threads is drastic, but in effect it will stall the client
> queues and "push back". The issue on Linux is the small number of
> nfsd contexts involved. It could lead to significant issues possibly
> including DOS attack. Dropping connections (judiciously) could be
> used instead of blocking the last few threads, though even that will
> have consequences.
>
> The easy way to test all this is decorate /etc/exports with lots of
> names, then break the nameservice and start sending requests from
> many new clients. It's very hard to get it all right.
>
> Tom.
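
For illustration only, here is a rough sketch of the retry schedule described above for nfs4_delay(): start at a tenth of a second, double on each retry, and cap at 15 seconds. This is not the kernel code. The function name, parameter names, and retry count are hypothetical, and the doubling factor is an assumption about what "exponential backoff" means here.

```python
def backoff_delays(initial=0.1, cap=15.0, retries=10):
    """Yield successive wait times in seconds.

    Assumed schedule (not taken from the kernel source): start at
    `initial`, double after every retry, and never exceed `cap`.
    """
    delay = initial
    for _ in range(retries):
        yield delay
        # Double the wait for the next retry, clamped to the cap.
        delay = min(delay * 2, cap)


# Collect the first ten waits a client would use under these assumptions.
delays = list(backoff_delays())
print(delays)
```

Under these assumptions the client reaches the 15 second cap on the ninth retry, whereas the "later clients" in the quoted text would wait a constant five seconds every time.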