Re: NFS not start on localhost

Gene Liverman <gliverma@xxxxxxxxxx> · Thu, 23 Oct 2014 13:03:57 -0400

Could you also provide the output of this command:

$ mount | column -t

--

Gene Liverman

Systems Administrator

Information Technology Services

University of West Georgia

gliverma@xxxxxxxxxx
ITS: Making Technology Work for You!

This e-mail and any attachments may contain confidential and privileged information. If you are not the intended recipient, please notify the sender immediately by return mail, delete this message, and destroy any copies. Any dissemination or use of this information by a person other than the intended recipient is unauthorized and may be illegal or actionable by law.

On Oct 23, 2014 10:07 AM, "Niels de Vos" <ndevos@xxxxxxxxxx> wrote:
The only way I can manage to hit this issue too, is by mounting an

NFS-export on the Gluster server that starts the Gluster/NFS process.

There is not crash happening on my side, Gluster/NFS just fails to

start.

Steps to reproduce:

1. mount -t nfs nas.example.net:/export /mnt

2. systemctl start glusterd

After this, the error about being unable to register NLM4 is in

/var/log/glusterfs/nfs.log.

This is expected, because the Linux kernel NFS-server requires an NLM

service in portmap/rpcbind (nlockmgr). You can verify what process

occupies the service slot in rpcbind like this:

1. list the rpc-programs and their port numbers

    # rpcinfo -p

2. check the process that listens on the TCP-port for nlockmgr (port

   32770 was returned by the command from point 1)

    # netstat -nlpt | grep -w 32770

If the right column in the output lists 'glusterfs', then the

Gluster/NFS process could register successfully and is handling the NLM4

calls. However, if the right columnt contains a single '-', the Linux

kernel module 'lockd' is handling the NLM4 calls. Gluster/NFS can not

work together with the Linux kernel NFS-client (mountpoint) or the Linux

kernel NFS-server.

Does this help? If something is unclear, post the output  if the above

commands and tell us what further details you want to see clarified.

Cheers,

Niels

On Mon, Oct 20, 2014 at 12:53:46PM +0200, Demeter Tibor wrote:

>

> Hi,

>

> Thank you for you reply.

>

> I did your recommendations, but there are no changes.

>

> In the nfs.log there are no new things.

>

>

> [root@node0 glusterfs]# reboot

> Connection to 172.16.0.10 closed by remote host.

> Connection to 172.16.0.10 closed.

> [tdemeter@sirius-31 ~]$ ssh root@172.16.0.10

> root@172.16.0.10's password:

> Last login: Mon Oct 20 11:02:13 2014 from 192.168.133.106

> [root@node0 ~]# systemctl status nfs.target

> nfs.target - Network File System Server

>    Loaded: loaded (/usr/lib/systemd/system/nfs.target; disabled)

>    Active: inactive (dead)

>

> [root@node0 ~]# gluster volume status engine

> Status of volume: engine

> Gluster process                                               Port    Online  Pid

> ------------------------------------------------------------------------------

> Brick gs00.itsmart.cloud:/gluster/engine0             50160   Y       3271

> Brick gs01.itsmart.cloud:/gluster/engine1             50160   Y       595

> NFS Server on localhost                                       N/A     N       N/A

> Self-heal Daemon on localhost                         N/A     Y       3286

> NFS Server on gs01.itsmart.cloud                      2049    Y       6951

> Self-heal Daemon on gs01.itsmart.cloud                        N/A     Y       6958

>

> Task Status of Volume engine

> ------------------------------------------------------------------------------

> There are no active volume tasks

>

> [root@node0 ~]# systemctl status

> Display all 262 possibilities? (y or n)

> [root@node0 ~]# systemctl status nfs-lock

> nfs-lock.service - NFS file locking service.

>    Loaded: loaded (/usr/lib/systemd/system/nfs-lock.service; enabled)

>    Active: inactive (dead)

>

> [root@node0 ~]# systemctl stop nfs-lock

> [root@node0 ~]# systemctl restart gluster

> glusterd.service    glusterfsd.service  gluster.mount

> [root@node0 ~]# systemctl restart gluster

> glusterd.service    glusterfsd.service  gluster.mount

> [root@node0 ~]# systemctl restart glusterfsd.service

> [root@node0 ~]# systemctl restart glusterd.service

> [root@node0 ~]# gluster volume status engine

> Status of volume: engine

> Gluster process                                               Port    Online  Pid

> ------------------------------------------------------------------------------

> Brick gs00.itsmart.cloud:/gluster/engine0             50160   Y       5140

> Brick gs01.itsmart.cloud:/gluster/engine1             50160   Y       2037

> NFS Server on localhost                                       N/A     N       N/A

> Self-heal Daemon on localhost                         N/A     N       N/A

> NFS Server on gs01.itsmart.cloud                      2049    Y       6951

> Self-heal Daemon on gs01.itsmart.cloud                        N/A     Y       6958

>

>

> Any other idea?

>

> Tibor

>

>

>

>

>

>

>

>

> ----- Eredeti üzenet -----

> > On Mon, Oct 20, 2014 at 09:04:2.8AM +0200, Demeter Tibor wrote:

> > > Hi,

> > >

> > > This is the full nfs.log after delete & reboot.

> > > It is refers to portmap registering problem.

> > >

> > > [root@node0 glusterfs]# cat nfs.log

> > > [2014-10-20 06:48:43.221136] I [glusterfsd.c:1959:main]

> > > 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.2

> > > (/usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p

> > > /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S

> > > /var/run/567e0bba7ad7102eae3049e2ad6c3ed7.socket)

> > > [2014-10-20 06:48:43.224444] I [socket.c:3561:socket_init]

> > > 0-socket.glusterfsd: SSL support is NOT enabled

> > > [2014-10-20 06:48:43.224475] I [socket.c:3576:socket_init]

> > > 0-socket.glusterfsd: using system polling thread

> > > [2014-10-20 06:48:43.224654] I [socket.c:3561:socket_init] 0-glusterfs: SSL

> > > support is NOT enabled

> > > [2014-10-20 06:48:43.224667] I [socket.c:3576:socket_init] 0-glusterfs:

> > > using system polling thread

> > > [2014-10-20 06:48:43.235876] I

> > > [rpcsvc.c:2127:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured

> > > rpc.outstanding-rpc-limit with value 16

> > > [2014-10-20 06:48:43.254087] I [socket.c:3561:socket_init]

> > > 0-socket.nfs-server: SSL support is NOT enabled

> > > [2014-10-20 06:48:43.254116] I [socket.c:3576:socket_init]

> > > 0-socket.nfs-server: using system polling thread

> > > [2014-10-20 06:48:43.255241] I [socket.c:3561:socket_init]

> > > 0-socket.nfs-server: SSL support is NOT enabled

> > > [2014-10-20 06:48:43.255264] I [socket.c:3576:socket_init]

> > > 0-socket.nfs-server: using system polling thread

> > > [2014-10-20 06:48:43.257279] I [socket.c:3561:socket_init]

> > > 0-socket.nfs-server: SSL support is NOT enabled

> > > [2014-10-20 06:48:43.257315] I [socket.c:3576:socket_init]

> > > 0-socket.nfs-server: using system polling thread

> > > [2014-10-20 06:48:43.258135] I [socket.c:3561:socket_init] 0-socket.NLM:

> > > SSL support is NOT enabled

> > > [2014-10-20 06:48:43.258157] I [socket.c:3576:socket_init] 0-socket.NLM:

> > > using system polling thread

> > > [2014-10-20 06:48:43.293724] E

> > > [rpcsvc.c:1314:rpcsvc_program_register_portmap] 0-rpc-service: Could not

> > > register with portmap

> > > [2014-10-20 06:48:43.293760] E [nfs.c:332:nfs_init_versions] 0-nfs: Program

> > > NLM4 registration failed

> >

> > The above line suggests that there already is a service registered at

> > portmapper for the NLM4 program/service. This happens when the kernel

> > module 'lockd' is loaded. The kernel NFS-client and NFS-server depend on

> > this, but unfortunately it conflicts with the Gluster/nfs server.

> >

> > Could you verify that the module is loaded?

> >  - use 'lsmod | grep lockd' to check the modules

> >  - use 'rpcinfo | grep nlockmgr' to check the rpcbind registrations

> >

> > Make sure that you do not mount any NFS exports on the Gluster server.

> > Unmount all NFS mounts.

> >

> > You mentioned you are running CentOS-7, which is systemd based. You

> > should be able to stop any conflicting NFS services like this:

> >

> >  # systemctl stop nfs-lock.service

> >  # systemctl stop nfs.target

> >  # systemctl disable nfs.target

> >

> > If all these services cleanup themselves, you should be able to start

> > the Gluster/nfs service:

> >

> >   # systemctl restart glusterd.service

> >

> > In case some bits are still lingering around, it might be easier to

> > reboot after disabling the 'nfs.target'.

> >

> > > [2014-10-20 06:48:43.293771] E [nfs.c:1312:init] 0-nfs: Failed to

> > > initialize protocols

> > > [2014-10-20 06:48:43.293777] E [xlator.c:403:xlator_init] 0-nfs-server:

> > > Initialization of volume 'nfs-server' failed, review your volfile again

> > > [2014-10-20 06:48:43.293783] E [graph.c:307:glusterfs_graph_init]

> > > 0-nfs-server: initializing translator failed

> > > [2014-10-20 06:48:43.293789] E [graph.c:502:glusterfs_graph_activate]

> > > 0-graph: init failed

> > > pending frames:

> > > frame : type(0) op(0)

> > >

> > > patchset: git://git.gluster.com/glusterfs.git

> > > signal received: 11

> > > time of crash: 2014-10-20 06:48:43configuration details:

> > > argp 1

> > > backtrace 1

> > > dlfcn 1

> > > fdatasync 1

> > > libpthread 1

> > > llistxattr 1

> > > setfsid 1

> > > spinlock 1

> > > epoll.h 1

> > > xattr.h 1

> > > st_atim.tv_nsec 1

> > > package-string: glusterfs 3.5.2

> > > [root@node0 glusterfs]# systemctl status portma

> > > portma.service

> > >    Loaded: not-found (Reason: No such file or directory)

> > >    Active: inactive (dead)

> > >

> > >

> > >

> > > Also I have checked the rpcbind service.

> > >

> > > [root@node0 glusterfs]# systemctl status rpcbind.service

> > > rpcbind.service - RPC bind service

> > >    Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled)

> > >    Active: active (running) since h 2014-10-20 08:48:39 CEST; 2min 52s ago

> > >   Process: 1940 ExecStart=/sbin/rpcbind -w ${RPCBIND_ARGS} (code=exited,

> > >   status=0/SUCCESS)

> > >  Main PID: 1946 (rpcbind)

> > >    CGroup: /system.slice/rpcbind.service

> > >            └─1946 /sbin/rpcbind -w

> > >

> > > okt 20 08:48:39 node0.itsmart.cloud systemd[1]: Starting RPC bind

> > > service...

> > > okt 20 08:48:39 node0.itsmart.cloud systemd[1]: Started RPC bind service.

> > >

> > > The restart does not solve this problem.

> > >

> > >

> > > I think this is the problem. Why are "exited" the portmap status?

> >

> > The 'portmap' service has been replaced with 'rpcbind' since RHEL-6.

> > They have the same functionality, 'rpcbind' just happens to be the newer

> > version.

> >

> > Did you file a bug for this already? As Vijay mentions, this crash seems

> > to happen because the Gluster/nfs service fails to initialize correctly

> > and then fails to cleanup correctly. The cleanup should get fixed, and

> > we should also give an easier to understand error message.

> >

> > Thanks,

> > Niels

> >

> > >

> > >

> > > On node1 is ok:

> > >

> > > [root@node1 ~]# systemctl status rpcbind.service

> > > rpcbind.service - RPC bind service

> > >    Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled)

> > >    Active: active (running) since p 2014-10-17 19:15:21 CEST; 2 days ago

> > >  Main PID: 1963 (rpcbind)

> > >    CGroup: /system.slice/rpcbind.service

> > >            └─1963 /sbin/rpcbind -w

> > >

> > > okt 17 19:15:21 node1.itsmart.cloud systemd[1]: Starting RPC bind

> > > service...

> > > okt 17 19:15:21 node1.itsmart.cloud systemd[1]: Started RPC bind service.

> > >

> > >

> > >

> > > Thanks in advance

> > >

> > > Tibor

> > >

> > >

> > >

> > > ----- Eredeti üzenet -----

> > > > On 10/19/2014 06:56 PM, Niels de Vos wrote:

> > > > > On Sat, Oct 18, 2014 at 01:24:12PM +0200, Demeter Tibor wrote:

> > > > >> Hi,

> > > > >>

> > > > >> [root@node0 ~]# tail -n 20 /var/log/glusterfs/nfs.log

> > > > >> [2014-10-18 07:41:06.136035] E [graph.c:307:glusterfs_graph_init]

> > > > >> 0-nfs-server: initializing translator failed

> > > > >> [2014-10-18 07:41:06.136040] E [graph.c:502:glusterfs_graph_activate]

> > > > >> 0-graph: init failed

> > > > >> pending frames:

> > > > >> frame : type(0) op(0)

> > > > >>

> > > > >> patchset: git://git.gluster.com/glusterfs.git

> > > > >> signal received: 11

> > > > >> time of crash: 2014-10-18 07:41:06configuration details:

> > > > >> argp 1

> > > > >> backtrace 1

> > > > >> dlfcn 1

> > > > >> fdatasync 1

> > > > >> libpthread 1

> > > > >> llistxattr 1

> > > > >> setfsid 1

> > > > >> spinlock 1

> > > > >> epoll.h 1

> > > > >> xattr.h 1

> > > > >> st_atim.tv_nsec 1

> > > > >> package-string: glusterfs 3.5.2

> > > > >

> > > > > This definitely is a gluster/nfs issue. For whatever reasone, the

> > > > > gluster/nfs server crashes :-/ The log does not show enough details,

> > > > > some more lines before this are needed.

> > > > >

> > > >

> > > > I wonder if the crash is due to a cleanup after the translator

> > > > initialization failure. The complete logs might help in understanding

> > > > why the initialization failed.

> > > >

> > > > -Vijay

> > > >

> > > >

> >

_______________________________________________

Gluster-users mailing list

Gluster-users@xxxxxxxxxxx

http://supercolony.gluster.org/mailman/listinfo/gluster-users
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users