Hi,

Thank you for your reply. I followed your recommendations, but nothing has changed; there is nothing new in nfs.log.

[root@node0 glusterfs]# reboot
Connection to 172.16.0.10 closed by remote host.
Connection to 172.16.0.10 closed.
[tdemeter@sirius-31 ~]$ ssh root@172.16.0.10
root@172.16.0.10's password:
Last login: Mon Oct 20 11:02:13 2014 from 192.168.133.106
[root@node0 ~]# systemctl status nfs.target
nfs.target - Network File System Server
   Loaded: loaded (/usr/lib/systemd/system/nfs.target; disabled)
   Active: inactive (dead)

[root@node0 ~]# gluster volume status engine
Status of volume: engine
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick gs00.itsmart.cloud:/gluster/engine0       50160   Y       3271
Brick gs01.itsmart.cloud:/gluster/engine1       50160   Y       595
NFS Server on localhost                         N/A     N       N/A
Self-heal Daemon on localhost                   N/A     Y       3286
NFS Server on gs01.itsmart.cloud                2049    Y       6951
Self-heal Daemon on gs01.itsmart.cloud          N/A     Y       6958

Task Status of Volume engine
------------------------------------------------------------------------------
There are no active volume tasks

[root@node0 ~]# systemctl status
Display all 262 possibilities? (y or n)
[root@node0 ~]# systemctl status nfs-lock
nfs-lock.service - NFS file locking service.
   Loaded: loaded (/usr/lib/systemd/system/nfs-lock.service; enabled)
   Active: inactive (dead)

[root@node0 ~]# systemctl stop nfs-lock
[root@node0 ~]# systemctl restart gluster glusterd.service glusterfsd.service gluster.mount
[root@node0 ~]# systemctl restart gluster glusterd.service glusterfsd.service gluster.mount
[root@node0 ~]# systemctl restart glusterfsd.service
[root@node0 ~]# systemctl restart glusterd.service
[root@node0 ~]# gluster volume status engine
Status of volume: engine
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick gs00.itsmart.cloud:/gluster/engine0       50160   Y       5140
Brick gs01.itsmart.cloud:/gluster/engine1       50160   Y       2037
NFS Server on localhost                         N/A     N       N/A
Self-heal Daemon on localhost                   N/A     N       N/A
NFS Server on gs01.itsmart.cloud                2049    Y       6951
Self-heal Daemon on gs01.itsmart.cloud          N/A     Y       6958

Any other idea?

Tibor

----- Original message -----
> On Mon, Oct 20, 2014 at 09:04:28AM +0200, Demeter Tibor wrote:
> > Hi,
> >
> > This is the full nfs.log after the delete & reboot.
> > It refers to a portmap registration problem.
> >
> > [root@node0 glusterfs]# cat nfs.log
> > [2014-10-20 06:48:43.221136] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.2 (/usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/567e0bba7ad7102eae3049e2ad6c3ed7.socket)
> > [2014-10-20 06:48:43.224444] I [socket.c:3561:socket_init] 0-socket.glusterfsd: SSL support is NOT enabled
> > [2014-10-20 06:48:43.224475] I [socket.c:3576:socket_init] 0-socket.glusterfsd: using system polling thread
> > [2014-10-20 06:48:43.224654] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
> > [2014-10-20 06:48:43.224667] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
> > [2014-10-20 06:48:43.235876] I [rpcsvc.c:2127:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 16
> > [2014-10-20 06:48:43.254087] I [socket.c:3561:socket_init] 0-socket.nfs-server: SSL support is NOT enabled
> > [2014-10-20 06:48:43.254116] I [socket.c:3576:socket_init] 0-socket.nfs-server: using system polling thread
> > [2014-10-20 06:48:43.255241] I [socket.c:3561:socket_init] 0-socket.nfs-server: SSL support is NOT enabled
> > [2014-10-20 06:48:43.255264] I [socket.c:3576:socket_init] 0-socket.nfs-server: using system polling thread
> > [2014-10-20 06:48:43.257279] I [socket.c:3561:socket_init] 0-socket.nfs-server: SSL support is NOT enabled
> > [2014-10-20 06:48:43.257315] I [socket.c:3576:socket_init] 0-socket.nfs-server: using system polling thread
> > [2014-10-20 06:48:43.258135] I [socket.c:3561:socket_init] 0-socket.NLM: SSL support is NOT enabled
> > [2014-10-20 06:48:43.258157] I [socket.c:3576:socket_init] 0-socket.NLM: using system polling thread
> > [2014-10-20 06:48:43.293724] E [rpcsvc.c:1314:rpcsvc_program_register_portmap] 0-rpc-service: Could not register with portmap
> > [2014-10-20 06:48:43.293760] E [nfs.c:332:nfs_init_versions] 0-nfs: Program NLM4 registration failed
>
> The above line suggests that there already is a service registered at
> portmapper for the NLM4 program/service. This happens when the kernel
> module 'lockd' is loaded. The kernel NFS-client and NFS-server depend on
> this, but unfortunately it conflicts with the Gluster/nfs server.
>
> Could you verify that the module is loaded?
> - use 'lsmod | grep lockd' to check the modules
> - use 'rpcinfo | grep nlockmgr' to check the rpcbind registrations
>
> Make sure that you do not mount any NFS exports on the Gluster server.
> Unmount all NFS mounts.
>
> You mentioned you are running CentOS-7, which is systemd based. You
> should be able to stop any conflicting NFS services like this:
>
> # systemctl stop nfs-lock.service
> # systemctl stop nfs.target
> # systemctl disable nfs.target
>
> If all these services clean up after themselves, you should be able to
> start the Gluster/nfs service:
>
> # systemctl restart glusterd.service
>
> In case some bits are still lingering around, it might be easier to
> reboot after disabling the 'nfs.target'.
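(Putting the checks and the cleanup above into one place: the following is only a sketch, assuming a systemd-based CentOS 7 node with lsmod, rpcinfo and awk installed, of the order in which the conflict can be verified and cleared before glusterd is restarted.)

    #!/bin/bash
    # Sketch: verify that the kernel NLM/lockd service is what holds the
    # portmap registration, then stop the conflicting kernel NFS units so
    # that the Gluster/nfs server can register NLM4 itself.

    echo "== kernel lockd module =="
    lsmod | grep lockd || echo "lockd is not loaded"

    echo "== nlockmgr registrations in rpcbind =="
    rpcinfo -p | grep nlockmgr || echo "no nlockmgr registration found"

    echo "== NFS mounts on this host (unmount these first) =="
    awk '$3 ~ /^nfs/' /proc/mounts

    # Stop and disable the kernel NFS services.
    systemctl stop nfs-lock.service nfs.target
    systemctl disable nfs.target

    # Restart glusterd so that it respawns the Gluster/nfs process.
    systemctl restart glusterd.service

    # Give the NFS translator a few seconds, then re-check what is registered.
    sleep 5
    rpcinfo -p | grep -E 'nfs|nlockmgr'

(If a stale nlockmgr entry remains even though lockd is no longer in use, a reboot with 'nfs.target' disabled, as suggested above, is the cleanest way to get rid of it.)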
> > [2014-10-20 06:48:43.293771] E [nfs.c:1312:init] 0-nfs: Failed to initialize protocols
> > [2014-10-20 06:48:43.293777] E [xlator.c:403:xlator_init] 0-nfs-server: Initialization of volume 'nfs-server' failed, review your volfile again
> > [2014-10-20 06:48:43.293783] E [graph.c:307:glusterfs_graph_init] 0-nfs-server: initializing translator failed
> > [2014-10-20 06:48:43.293789] E [graph.c:502:glusterfs_graph_activate] 0-graph: init failed
> > pending frames:
> > frame : type(0) op(0)
> >
> > patchset: git://git.gluster.com/glusterfs.git
> > signal received: 11
> > time of crash: 2014-10-20 06:48:43
> > configuration details:
> > argp 1
> > backtrace 1
> > dlfcn 1
> > fdatasync 1
> > libpthread 1
> > llistxattr 1
> > setfsid 1
> > spinlock 1
> > epoll.h 1
> > xattr.h 1
> > st_atim.tv_nsec 1
> > package-string: glusterfs 3.5.2
> >
> > [root@node0 glusterfs]# systemctl status portma
> > portma.service
> >    Loaded: not-found (Reason: No such file or directory)
> >    Active: inactive (dead)
> >
> > I have also checked the rpcbind service:
> >
> > [root@node0 glusterfs]# systemctl status rpcbind.service
> > rpcbind.service - RPC bind service
> >    Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled)
> >    Active: active (running) since h 2014-10-20 08:48:39 CEST; 2min 52s ago
> >   Process: 1940 ExecStart=/sbin/rpcbind -w ${RPCBIND_ARGS} (code=exited, status=0/SUCCESS)
> >  Main PID: 1946 (rpcbind)
> >    CGroup: /system.slice/rpcbind.service
> >            └─1946 /sbin/rpcbind -w
> >
> > okt 20 08:48:39 node0.itsmart.cloud systemd[1]: Starting RPC bind service...
> > okt 20 08:48:39 node0.itsmart.cloud systemd[1]: Started RPC bind service.
> >
> > Restarting it does not solve the problem.
> >
> > I think this is the problem. Why is the portmap status "exited"?
>
> The 'portmap' service has been replaced with 'rpcbind' since RHEL-6.
> They have the same functionality; 'rpcbind' just happens to be the newer
> version.
>
> Did you file a bug for this already? As Vijay mentions, this crash seems
> to happen because the Gluster/nfs service fails to initialize correctly
> and then fails to clean up correctly. The cleanup should get fixed, and
> we should also give an easier-to-understand error message.
>
> Thanks,
> Niels
>
> >
> > On node1 it is ok:
> >
> > [root@node1 ~]# systemctl status rpcbind.service
> > rpcbind.service - RPC bind service
> >    Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled)
> >    Active: active (running) since p 2014-10-17 19:15:21 CEST; 2 days ago
> >  Main PID: 1963 (rpcbind)
> >    CGroup: /system.slice/rpcbind.service
> >            └─1963 /sbin/rpcbind -w
> >
> > okt 17 19:15:21 node1.itsmart.cloud systemd[1]: Starting RPC bind service...
> > okt 17 19:15:21 node1.itsmart.cloud systemd[1]: Started RPC bind service.
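(Since node1 registers the Gluster/nfs server fine while node0 does not, comparing what each node has registered can help pin down the conflict. A minimal sketch, to be run as root on each node; it assumes rpcinfo, ss from iproute2 and ps are present, which is the default on CentOS 7.)

    # RPC programs currently registered with rpcbind; on the healthy node
    # nlockmgr and nfs should be registered by the glusterfs NFS process.
    rpcinfo -p | grep -E 'nfs|nlockmgr|mountd|status'

    # Which process, if any, is listening on the NFS port 2049.
    ss -tlnp | grep -w 2049

    # Whether the Gluster/nfs process itself is running.
    ps ax | grep '[g]lusterfs' | grep nfs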
> >
> > Thanks in advance,
> >
> > Tibor
> >
> > ----- Original message -----
> > > On 10/19/2014 06:56 PM, Niels de Vos wrote:
> > > > On Sat, Oct 18, 2014 at 01:24:12PM +0200, Demeter Tibor wrote:
> > > >> Hi,
> > > >>
> > > >> [root@node0 ~]# tail -n 20 /var/log/glusterfs/nfs.log
> > > >> [2014-10-18 07:41:06.136035] E [graph.c:307:glusterfs_graph_init] 0-nfs-server: initializing translator failed
> > > >> [2014-10-18 07:41:06.136040] E [graph.c:502:glusterfs_graph_activate] 0-graph: init failed
> > > >> pending frames:
> > > >> frame : type(0) op(0)
> > > >>
> > > >> patchset: git://git.gluster.com/glusterfs.git
> > > >> signal received: 11
> > > >> time of crash: 2014-10-18 07:41:06
> > > >> configuration details:
> > > >> argp 1
> > > >> backtrace 1
> > > >> dlfcn 1
> > > >> fdatasync 1
> > > >> libpthread 1
> > > >> llistxattr 1
> > > >> setfsid 1
> > > >> spinlock 1
> > > >> epoll.h 1
> > > >> xattr.h 1
> > > >> st_atim.tv_nsec 1
> > > >> package-string: glusterfs 3.5.2
> > > >
> > > > This definitely is a gluster/nfs issue. For whatever reason, the
> > > > gluster/nfs server crashes :-/ The log does not show enough details;
> > > > some more lines before this are needed.
> > >
> > > I wonder if the crash is due to a cleanup after the translator
> > > initialization failure. The complete logs might help in understanding
> > > why the initialization failed.
> > >
> > > -Vijay

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users