Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> On Nov 12, 2011, at 1:49 PM, Jim Rees wrote:
>
>> The question for us is how long should an nfsroot client wait for the
>> server to reply. It sounds like the client used to wait longer than it
>> does now.
>
> Before, the client performed the GETPORT(NFS) step synchronously, first. This
> took 30 seconds or so to timeout. When it did, the client decided to proceed
> with port 2049. Then it went on to do the other mount tasks, and at the point
> had waited long enough that these tasks did not time out while waiting for the
> switch port.
>
>> It seems to me the client should wait at least 90 seconds so that the
>> situation you're in (servers on non-portfast ports) will work. I would
>> think they should wait indefinitely, since there's not much else they
>> can do.
>
> It should be simple to wrap the (MNT(mnt), NFS(getroot)) steps in a while(true)
> loop. Would mount_root_nfs() be the right place for this?
>

I thought it would be harder, and I had no time to look inside the kernel,
but I have now written a patch: the kernel tries to create the MNT RPC client
up to three times instead of once, and then gives up. Third time lucky... ;-)

In my case the second MNT request is successful:

---
[ 71.594744] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 72.617007] IP-Config: Complete:
[ 72.617077] device=eth0, addr=137.226.167.242, mask=255.255.255.224, gw=137.226.167.225,
[ 72.617278] host=137.226.167.242, domain=, nis-domain=(none),
[ 72.617393] bootserver=255.255.255.255, rootserver=137.226.167.241, rootpath=
[ 72.617741] Root-NFS: nfsroot=/srv/nfs/cluster2
[ 72.618010] NFS: nfs mount opts='udp,nolock,addr=137.226.167.241'
[ 72.618147] NFS: parsing nfs mount option 'udp'
[ 72.618187] NFS: parsing nfs mount option 'nolock'
[ 72.618233] NFS: parsing nfs mount option 'addr=137.226.167.241'
[ 72.618301] NFS: MNTPATH: '/srv/nfs/cluster2'
[ 72.618335] NFS: sending MNT request for 137.226.167.241:/srv/nfs/cluster2
[ 72.618383] NFS: 1. MNT request
[ 73.691872] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx
[ 73.711988] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 107.697332] NFS: 2. MNT request
[ 107.704591] NFS: received 1 auth flavors
[ 107.704653] NFS: auth flavor[0]: 1
[ 107.704834] NFS: MNT request succeeded
[ 107.704897] NFS: using auth flavor 1
[ 107.711857] VFS: Mounted root (nfs filesystem) on device 0:13.
INIT: version 2.88 booting
---

So many thanks again for your help and your very helpful hints!

Regards,
Lukas

PS: This is what I changed:

--- linux-2.6.39.4/fs/nfs/mount_clnt.c	2011-08-03 21:43:28.000000000 +0200
+++ linux-2.6.39.4-fix/fs/nfs/mount_clnt.c	2011-11-13 01:58:13.000000000 +0100
@@ -164,6 +164,7 @@
 	};
 	struct rpc_clnt *mnt_clnt;
 	int status;
+	int attempt = 0;
 
 	dprintk("NFS: sending MNT request for %s:%s\n",
 		(info->hostname ? info->hostname : "server"),
@@ -172,7 +173,13 @@
 	if (info->noresvport)
 		args.flags |= RPC_CLNT_CREATE_NONPRIVPORT;
 
-	mnt_clnt = rpc_create(&args);
+	do {
+		attempt++;
+		dprintk("NFS: %d. MNT request\n", attempt);
+		mnt_clnt = rpc_create(&args);
+	} while (IS_ERR(mnt_clnt) && attempt < 3);
+
+
 	if (IS_ERR(mnt_clnt))
 		goto out_clnt_err;
 
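
For readers who want to try the retry logic outside the kernel, the do/while
loop from the patch can be sketched in userspace roughly as follows. This is
only an illustration under stated assumptions: create_with_retry(), create_fn,
struct fake_link and fake_create() are hypothetical names and not kernel APIs,
and a NULL return stands in for the IS_ERR() check on the rpc_create() result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for rpc_create(): returns NULL on failure. */
typedef void *(*create_fn)(void *ctx);

/*
 * Call create() until it succeeds or max_attempts calls have been made.
 * The number of attempts actually made is stored in *attempts_out
 * when that pointer is non-NULL.
 */
static void *create_with_retry(create_fn create, void *ctx,
			       int max_attempts, int *attempts_out)
{
	void *clnt = NULL;
	int attempt = 0;

	assert(max_attempts > 0);

	do {
		attempt++;
		clnt = create(ctx);
	} while (clnt == NULL && attempt < max_attempts);

	if (attempts_out)
		*attempts_out = attempt;
	return clnt;
}

/* Fake creator for demonstration: fails until the simulated link is up. */
struct fake_link {
	int calls;	/* how many times the creator has been invoked */
	int up_after;	/* number of calls that fail before the link is ready */
};

static void *fake_create(void *ctx)
{
	struct fake_link *link = ctx;

	link->calls++;
	return link->calls > link->up_after ? (void *)link : NULL;
}
```

With up_after set to 1 the sketch mirrors the log above: the first attempt
fails and the second succeeds. The loop is bounded by an attempt count, not by
elapsed time; in the log the first attempt took about 35 seconds to fail, so
the total wait depends on how long each failed attempt blocks.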