On Sun, Oct 4, 2009 at 10:22 PM, Daniel J Blueman
<daniel.blueman@xxxxxxxxx> wrote:
> On Sun, Oct 4, 2009 at 11:10 PM, Trond Myklebust
> <Trond.Myklebust@xxxxxxxxxx> wrote:
>> On Sat, 2009-10-03 at 16:59 +0100, Daniel J Blueman wrote:
>>> Hi Trond,
>>>
>>> On Mon, Sep 28, 2009 at 7:16 PM, Trond Myklebust
>>> <Trond.Myklebust@xxxxxxxxxx> wrote:
>>> > On Sat, 2009-09-26 at 19:14 +0100, Daniel J Blueman wrote:
>>> >> Hi Trond,
>>> >>
>>> >> After rebooting my 2.6.31 NFS4 server, I see a list of NFS kernel
>>> >> errors [1] on the 2.6.31 client corresponding to NFS4ERR_GRACE, so
>>> >> lock or file state recovery failed. Is this expected noting that I
>>> >> have an internal firewall allowing incoming TCP port 2049 on the
>>> >> server, and no firewall on the client, however I can't see how it can
>>> >> thus be callback related?
>>> >
>>> > No. It looks as if your server rebooted while the client was recovering
>>> > an expired lease.
>>> >
>>> > The following patch should prevent future occurrences of this bug...
>>> >
>>> > Cheers
>>> > Trond
>>> > ------------------------------------------------------------------
>>> > NFSv4: Handle NFS4ERR_GRACE when recovering an expired lease.
>>> >
>>> > From: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
>>> >
>>> > If our lease expires, and the server subsequently reboot, we need to be
>>> > able to handle the case where the server refuses to let us recover state,
>>> > because it is in the grace period.
>>>
>>> On the client, I didn't see the error messages with this patch,
>>> however I did see firefox (via sqlite) continue to hang [1] (after
>>> other processes continued), and an unusual level of activity with
>>> rpciod/0 and rpciod/1 kernel threads. Other NFS-related kernel thread
>>> state is given.
>>
>> What are your mount options?
>
> $ grep nfs /proc/mounts
> x1:/ /net nfs4 rw,relatime,vers=4,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.10.2,addr=192.168.10.250 0 0
>
> All procfs settings are default; let me know if anything else will
> help and thanks for taking a look!

In the same situation, but with 2.6.32-rc5 on the server and 2.6.31.4
on the client, I see "nfs4_reclaim_open_state: Lock reclaim failed" in
the client's kernel log, and the application (reproducible with
firefox) shows a failure mode (e.g. empty lists in live bookmarks).

Is this expected behaviour, i.e. is there a finite state recovery window?

Many thanks,
  Daniel
-- 
Daniel J Blueman
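
P.S. For anyone comparing setups, here is a minimal sketch (Python) of
pulling the NFS options out of /proc/mounts-style text so that hard/soft,
timeo and retrans can be checked at a glance. The function name
parse_nfs_mounts and the SAMPLE string are hypothetical and only for
illustration; the sample line is the mount entry quoted above. Note that
timeo is expressed in tenths of a second, so timeo=600 corresponds to 60
seconds.

```python
def parse_nfs_mounts(text):
    """Return (device, mountpoint, fstype, options) for each NFS mount.

    `text` is expected to look like the contents of /proc/mounts.
    Options without a value (e.g. "hard") map to True; options with a
    value (e.g. "timeo=600") map to that value as a string.
    """
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mountpoint, fstype, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        opts = {}
        for item in fields[3].split(","):
            key, sep, value = item.partition("=")
            opts[key] = value if sep else True
        mounts.append((fields[0], fields[1], fields[2], opts))
    return mounts


# Illustrative input only: the mount entry from this thread.
SAMPLE = (
    "x1:/ /net nfs4 rw,relatime,vers=4,rsize=262144,wsize=262144,"
    "namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,"
    "clientaddr=192.168.10.2,addr=192.168.10.250 0 0"
)

if __name__ == "__main__":
    for device, mountpoint, fstype, opts in parse_nfs_mounts(SAMPLE):
        timeo_seconds = int(opts["timeo"]) / 10  # timeo is in deciseconds
        print(device, mountpoint, fstype,
              "hard" if opts.get("hard") else "soft",
              "timeo=%.0fs" % timeo_seconds,
              "retrans=%s" % opts["retrans"])
```

On a live client the same function can be fed open("/proc/mounts").read()
instead of SAMPLE.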