On Mon, Jan 26, 2015 at 12:26:53AM +0000, Peter Auyeung wrote:
> Hi Niels,
>
> The question is that we keep getting the lockd error even after
> restarting and rebooting the NFS client.

This particular error only occurs when the NFS-server could not
register the nlockmgr RPC-program at rpcbind/portmapper. The most
likely scenario where this fails is when there is an NFS-client (or
service) on the storage server that conflicts with the Gluster/NFS
service.

If there are conflicting RPC services registered at rpcbind/portmapper,
you may be able to check and remove them with the 'rpcinfo' command.
Ports that are listed in its output, but are not listed in netstat/ss,
are in use by kernel services (like the lockd kernel module).

In order to restore the NLM functionality of Gluster/NFS, you can take
these steps:

1. Ensure that no other NFS-services (server or client) are running on
   the Gluster storage server. Gluster/NFS should be the only service
   doing any NFS on that server.
2. Stop the rpcbind service.
3. Clear the rpcbind cache (rm /var/lib/rpcbind/portmap.xdr).
4. Start the rpcbind service.
5. Restart the Gluster/NFS service.

A shell sketch of these checks and steps follows at the bottom of this
mail.

In case your NFS-client got connected to the incorrect NLM service on
your storage server, you will need to unmount and mount the export
again.

Niels

> Peter
> ________________________________________
> From: Niels de Vos [ndevos@xxxxxxxxxx]
> Sent: Saturday, January 24, 2015 3:26 AM
> To: Peter Auyeung
> Cc: gluster-users@xxxxxxxxxxx; gluster-devel@xxxxxxxxxxx
> Subject: Re: [Gluster-devel] lockd: server not responding, timed out
>
> On Fri, Jan 23, 2015 at 11:50:26PM +0000, Peter Auyeung wrote:
> > We have a 6-node Gluster setup running Ubuntu on XFS, sharing
> > Gluster volumes over NFS; it has been running fine for 3 months.
> > We restarted glusterfs-server on one of the nodes, and all NFS
> > clients started getting "lockd: server not responding, timed out"
> > in /var/log/messages.
> >
> > We are still able to read and write, but processes that require a
> > persistent file lock fail, like database exports.
> >
> > We have an interim fix to remount the NFS exports with the nolock
> > option, but we need to know why that is suddenly necessary after a
> > 'service glusterfs-server restart' on one of the Gluster nodes.
>
> The reason that you need to mount with 'nolock' is that one server
> can only have one NLM-service active. The Linux NFS-client uses the
> 'lockd' kernel module, and the Gluster/NFS server provides its own
> lock manager. To be able to use a lock manager, it needs to be
> registered at rpcbind/portmapper. Only one lock manager can be
> registered at a time; the 2nd one that tries to register will fail.
> In case the NFS-client has registered the lockd kernel module as the
> lock manager, any locking requests to the Gluster/NFS service will
> fail and you will see those messages in /var/log/messages.
>
> This is one of the main reasons why it is not advised to access
> volumes over NFS on a Gluster storage server. You should rather use
> the GlusterFS protocol for mounting volumes locally. (Or even better,
> separate your storage servers from the application servers.)
>
> HTH,
> Niels
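
As a sketch of the checks and recovery steps above (the service names
'rpcbind' and 'glusterfs-server' and the cache path are assumptions
that match the Ubuntu/Debian setup from this thread; adjust for your
distribution):

    # Look for a conflicting lock manager: nlockmgr is RPC program
    # 100021, and it can end up registered by the kernel lockd module
    # instead of by Gluster/NFS.
    rpcinfo -p

    # Compare the ports rpcinfo lists with the sockets held by
    # userspace processes; a port that rpcinfo shows but ss/netstat
    # does not is in use by a kernel service such as lockd.
    ss -tulpn        # or: netstat -tulpn

    # A stale registration can be removed per program/version, e.g.
    # for NLM (100021) versions 1, 3 and 4:
    rpcinfo -d 100021 1
    rpcinfo -d 100021 3
    rpcinfo -d 100021 4

    # Then restart rpcbind with a clean cache and restart Gluster/NFS:
    service rpcbind stop
    rm -f /var/lib/rpcbind/portmap.xdr
    service rpcbind start
    service glusterfs-server restart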
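
On the client side, if it negotiated locking against the wrong NLM
service, the export has to be re-mounted (server name, volume and
mountpoint below are just example values; Gluster/NFS serves NFSv3):

    # Re-mount so the client locks against the fresh lock manager:
    umount /mnt/data
    mount -t nfs -o vers=3 server1:/myvolume /mnt/data

    # The interim workaround from the earlier mail disables NLM
    # locking on the client entirely:
    mount -t nfs -o vers=3,nolock server1:/myvolume /mnt/data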