Re: NFS4 in combination with root over NFS3, hangs and deadlocks

OK, my guess may be wrong.

However, I found another report earlier in the mailing list archive.

The subject was "NFS regression? Odd delays and lockups accessing an NFS export.", with the last message dated 2008-09-27 10:16:26.

There was a lot of traffic with attempts to investigate the problem, but I could not find out whether it was ever resolved.

On Mar 31, 2010, at 2:09 AM, Anton Starikov wrote:

> I'm not an expert in kernel debugging, but I think the hang happens in rpcauth_lookup_credcache.
> 
> 
> On Mar 30, 2010, at 10:59 PM, Anton Starikov wrote:
> 
>> Then it isn't normal.
>> A diskless setup is then limited to old NFSv3 for non-root partitions, which isn't nice:
>> no proper ACLs, no delegations.
>> 
>> 
>> On Mar 30, 2010, at 9:27 PM, Chuck Lever wrote:
>> 
>>> On 03/30/2010 03:11 PM, Anton Starikov wrote:
>>>> On Mar 30, 2010, at 9:00 PM, Chuck Lever wrote:
>>>> 
>>>>> On 03/30/2010 02:30 PM, Anton Starikov wrote:
>>>>>> If this is an already resolved problem, can someone point me in the direction of the particular patch?
>>>>> 
>>>>> As far as I know NFSv4 is known not to work with an NFSv3 root, in any kernel.
>>>> 
>>>> 
>>>> But an NFSv4 root (does it finally work?) isn't always a desirable solution, especially if different OSes are used for client and server.
>>>> 
>>>> And it seems that it generally works; just some deadlock occurs, probably related to caching of some credentials.
>>> 
>>> No, NFSv4 root is known to have problems, and is unsupported, as far as I know.
>>> 
>>>> Anton.
>>>> 
>>>>>> Anton.
>>>>>> 
>>>>>> 
>>>>>> On Mar 29, 2010, at 5:14 PM, Anton Starikov wrote:
>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> Earlier (a year ago and again recently) I reported my failures in getting working NFSv4 mounts (primarily automounting /home) on a system booted with an NFSv3 root. It always silently hung the nodes, with zero output in the logs. It was definitely a client issue (I tried it with different versions of Linux and Solaris servers).
>>>>>>> 
>>>>>>> I still can't produce a simple, reproducible test case, because the hangs appear randomly: it can happen within 1 hour or after 5 days, but it always happens eventually. This time, however, I got some improvement.
>>>>>>> 
>>>>>>> With the 2.6.32.9-70.fc12.x86_64 kernel and fresh nfs-utils from Fedora 12, after the NFSv4 mounts hang, the NFSv3 mounts and the node itself continue to work, which gives a chance to investigate the problem.
>>>>>>> 
>>>>>>> Can you give me instructions on how to collect all the information necessary to figure out where the bug is?
>>>>>>> 
>>>>>>> As a starting point I will attach the output of echo "t" > sysrq-trigger and the list of NFS mounts.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Anton.
>>>>>>> 
>>>>>>> # cat /proc/mounts | grep nfs
>>>>>>> 172.19.8.1:/export/share/cluster/fedora-root / nfs ro,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,nolock,proto=udp,port=65535,timeo=7,retrans=3,sec=sys,mountport=65535,addr=172.19.8.1 0 0
>>>>>>> none /var/lib/nfs tmpfs rw,relatime 0 0
>>>>>>> sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
>>>>>>> 172.19.8.1:/export/share/cluster/admin /root nfs rw,noatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,noacl,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.19.8.1,mountvers=3,mountport=44114,mountproto=tcp,addr=172.19.8.1 0 0
>>>>>>> 172.19.8.1:/export/share/cluster/checkpoint /mnt/checkpoint nfs rw,noatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,noacl,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.19.8.1,mountvers=3,mountport=52574,mountproto=udp,addr=172.19.8.1 0 0
>>>>>>> 172.19.8.1:/export/share/software /software nfs rw,noatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,noacl,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.19.8.1,mountvers=3,mountport=44114,mountproto=tcp,addr=172.19.8.1 0 0
>>>>>>> 172.19.8.1:/export/share/cluster/torque /var/torque nfs rw,noatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,noacl,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.19.8.1,mountvers=3,mountport=44114,mountproto=tcp,addr=172.19.8.1 0 0
>>>>>>> 172.19.8.1:/export/share/common/ /common nfs4 rw,noatime,vers=4,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=172.19.8.133,addr=172.19.8.1 0 0
>>>>>>> 172.19.8.1:/export/home/alfons/ /home/alfons nfs4 rw,relatime,vers=4,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=172.19.8.133,addr=172.19.8.1 0 0
>>>>>>> 
>>>>>>> <log1.txt.gz>
>>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>> 
>>>>> 
>>>>> --
>>>>> chuck[dot]lever[at]oracle[dot]com
>>>> 
>>> 
>>> 
>>> -- 
>>> chuck[dot]lever[at]oracle[dot]com
>> 
> 
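
For reference, the mixed NFSv3/NFSv4 layout shown in the quoted /proc/mounts listing can be summarized per protocol version with a small script. This is only a minimal sketch: the function name nfs_mounts_by_version and the SAMPLE text are hypothetical, and the sample option strings are abbreviated from the listing above for illustration. It assumes the usual /proc/mounts field order (device, mount point, filesystem type, options).

```python
from collections import defaultdict

# Abbreviated sample in /proc/mounts format (options shortened for illustration).
SAMPLE = """\
172.19.8.1:/export/share/cluster/fedora-root / nfs ro,relatime,vers=3 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
172.19.8.1:/export/share/common/ /common nfs4 rw,noatime,vers=4 0 0
"""


def nfs_mounts_by_version(mounts_text):
    """Group NFS mount points by the value of their vers= mount option."""
    groups = defaultdict(list)
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fstype, options = fields[:4]
        if fstype not in ("nfs", "nfs4"):
            continue  # ignore tmpfs, rpc_pipefs, and other filesystems
        # Turn "a=1,b,c=2" into {"a": "1", "b": "", "c": "2"}.
        opts = dict(opt.partition("=")[::2] for opt in options.split(","))
        # Fall back on the filesystem type if no vers= option is listed.
        default = "4" if fstype == "nfs4" else "unknown"
        groups[opts.get("vers", default)].append(mount_point)
    return dict(groups)


print(nfs_mounts_by_version(SAMPLE))
```

On a live node the same function could be fed the contents of /proc/mounts instead of SAMPLE, which makes it easy to see at a glance which mount points are NFSv4 and therefore candidates for the hang described in this thread.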
