On Mon, Mar 01, 2010 at 04:01:42PM +0100, Anton Starikov wrote:
> Hi,
>
> My config is a diskless NFSv3 nfsroot (+ some extra NFSv3 mounts) and NFSv4 /home/* automount.
> CentOS 5.4, kernel 2.6.18-164.11.1.el5.

That's the client? What's the server?

That's a pretty old kernel; I'd file a bug with CentOS.

> Periodically my nodes hang, and nothing appears in the logs (remote syslog + netconsole).
> The node is kind of alive: you can ping it, some daemons (for example pbs_mom) report that it's alive, etc.
> But anything which requires FS access is frozen.
>
> Another symptom: it looks like portmap doesn't answer. At least if I try "rpcinfo -p node_name", it ends with
> "rpcinfo: can't contact portmapper: rpcinfo: RPC: Timed out"
>
> In principle, this could have something to do with locking.
> At least, I had to mount all my NFSv3 mounts with nolock to reduce the frequency of the problem (nfsroot was nolock, obviously, but there are a couple of extra v3 mounts, like /opt with extra software and a RW directory for torque).
>
> What can be the problem here?
>
> What kind of information do I have to collect from the system to figure out what the real problem is?

Is there any server-side logging? Can you see any interesting network traffic after the hang?

--b.
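
One cheap data point to collect from another machine while a node is hung is whether its RPC ports still accept TCP connections at all. The following is only a rough sketch, not something from the original report: the helper name probe_tcp and the 3-second timeout are made up for illustration, and it assumes the standard ports 111 (portmapper) and 2049 (NFS). A plain connect only shows that the kernel still accepts connections on the port; it does not prove the daemon behind it answers RPC calls, so "open" combined with an rpcinfo timeout would merely suggest that the daemon itself is stuck.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Classify a TCP connect attempt to host:port.

    Returns 'open', 'refused', 'timeout', or 'error'.
    Hypothetical helper for illustration; the timeout value is arbitrary.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered with a reset: nothing is listening on the port.
        return "refused"
    except socket.timeout:
        # No answer within the timeout: packets dropped or host not responding.
        return "timeout"
    except OSError:
        # Anything else (unreachable network, name resolution failure, ...).
        return "error"
```

It could be run against the hung node as, for example, probe_tcp("node_name", 111) and probe_tcp("node_name", 2049), where node_name is the placeholder host name from the report above.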