Re: [BUG?] Maybe NFS bug since 2.6.37 on SPARC64

On Nov 3, 2011, at 5:37 PM, Lukas Razik wrote:

>> On Nov 3, 2011, at 5:11 PM, Jim Rees wrote:
> 
>> 
>>> Trond Myklebust wrote:
>>> 
>>>> [ 442.666622] NFS: failed to create MNT RPC client, status=-60
>>>> [ 442.666732] NFS: unable to mount server 137.226.167.241, error -60
>>>> [ 442.666868] VFS: Unable to mount root fs via NFS, trying floppy.
>>>> [ 442.667032] VFS: Insert root floppy and press ENTER
>>>> 
>>>   Error 60 is ETIMEDOUT on SPARC, so it seems that the problem is
>>>   basically the same one that you see in your 2.6.32 trace (rpcbind:
>>>   server 137.226.167.241 not responding, timed out) except that now it is
>>>   a fatal error.
>>> 
>>>   Any idea why the first RPC calls might be failing here? A switch
>>>   misconfiguration or something like that perhaps?
>>> 
>>> Wasn't there a change in the way nfs mount options are handled by the kernel
>>> for nfsroot about the time of 2.6.39?  Something about changing from default
>>> udp to tcp maybe?
>> 
>> There was a change, but it was changed back to UDP because of problems like 
>> this.  Behavior in 3.0 or the latest 2.6.39 stable kernel may be improved.
>> 
> 
> I don't know whether this was a hint to test the newest 2.6.39, but as I wrote in my first email
>  http://thread.gmane.org/gmane.linux.nfs/44596
> this is the output of linux-2.6.39.4 with "nfsdebug":
> 
> [ 407.571521] IP-Config: Complete:
> [ 407.571589] device=eth0, addr=137.226.167.242, mask=255.255.255.224, gw=137.226.167.225,
> [ 407.571793] host=cluster2, domain=, nis-domain=(none),
> [ 407.571907] bootserver=255.255.255.255, rootserver=137.226.167.241, rootpath=
> [ 407.572332] Root-NFS: nfsroot=/srv/nfs/cluster2
> [ 407.572726] NFS: nfs mount opts='udp,nolock,addr=137.226.167.241'
> [ 407.572927] NFS: parsing nfs mount option 'udp'
> [ 407.572995] NFS: parsing nfs mount option 'nolock'
> [ 407.573071] NFS: parsing nfs mount option 'addr=137.226.167.241'
> [ 407.573139] NFS: MNTPATH: '/srv/nfs/cluster2'
> [ 407.573203] NFS: sending MNT request for 137.226.167.241:/srv/nfs/cluster2
> [ 408.617894] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx
> [ 408.638319] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> [ 442.666622] NFS: failed to create MNT RPC client, status=-60
> [ 442.666732] NFS: unable to mount server 137.226.167.241, error -60
> [ 442.666868] VFS: Unable to mount root fs via NFS, trying floppy.
> [ 442.667032] VFS: Insert root floppy and press ENTER
> 
> And this behaviour is exactly the same in all the other kernels from 2.6.37 to 2.6.39.4 that I've tested.
> So if any of you has an idea what I could try, I'll follow it...

Find out why the very first RPC on your system always fails.  As Trond says, the only reason this worked on the older kernels is that NFSROOT fell back to a default port for NFSD.  That fallback is also broken behavior, but in your case it happened to work, so you never noticed it.

I seem to recall there's a way to set the NFS and RPC debugging flags on the kernel command line so more information can be captured during boot.  But I don't see it under Documentation/.

You could add a line to nfs_root_debug() in fs/nfs/nfsroot.c that also sets flags in the global rpc_debug variable, so that the RPC layer logs more information during boot.
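
A rough sketch of that change, written so it compiles in user space.  It assumes nfs_root_debug() currently does nothing more than OR NFSDBG_ROOT and NFSDBG_MOUNT into nfs_debug; the stand-in variables and the numeric flag values are assumptions made for illustration, so check them against include/linux/nfs_fs.h and include/linux/sunrpc/debug.h in your tree.  In the kernel the function is additionally marked __init and is only built when RPC debugging is configured.

```c
/*
 * User-space stand-ins so this sketch compiles outside the kernel tree.
 * The flag values are assumptions; verify them against your kernel headers.
 */
#define NFSDBG_ROOT	0x0080
#define NFSDBG_MOUNT	0x0400
#define RPCDBG_ALL	0x7fff

static unsigned int nfs_debug;	/* stand-in for the kernel's NFS debug mask */
static unsigned int rpc_debug;	/* stand-in for the kernel's RPC debug mask */

/* Handler invoked when the NFSROOT debug option is on the command line. */
static int nfs_root_debug(char *arg)
{
	(void)arg;	/* the option takes no value */
	nfs_debug |= NFSDBG_ROOT | NFSDBG_MOUNT;
	rpc_debug |= RPCDBG_ALL;	/* added line: also trace the RPC layer */
	return 1;
}
```

RPCDBG_ALL is noisy; if the console output becomes unreadable, narrowing the mask to the transport and call bits may be enough to see why the first request times out.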

-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com



