Re: After upgrade from 3.5 to 3.7 gluster local NFS is not starting on one of the servers

any ideas?

2015-09-13 20:41 GMT+08:00 Yaroslav Molochko <onorua@xxxxxxxxx>:
So, I've done:
root@PSC01SERV008:/var/log# tail -f syslog | grep -Ev 'docker|kubelet|kube-proxy'
Sep 13 12:18:16 psc01serv008 systemd[1]: Stopped GlusterFS an clustered file-system server.
Sep 13 12:19:21 psc01serv008 systemd[1]: Reloading.
Sep 13 12:19:35 psc01serv008 systemd[1]: message repeated 3 times: [ Reloading.]
Sep 13 12:20:10 psc01serv008 systemd[1]: Starting GlusterFS an clustered file-system server...
Sep 13 12:20:12 psc01serv008 systemd[1]: Started GlusterFS an clustered file-system server.

and stopped glusterfs. It reported that it had stopped, but the processes were still there, so I killed them manually. Maybe something is wrong with the systemd unit file, but it was working with 3.5, so I don't know. Then I disabled my "homemade" glusterfs service and enabled /etc/init.d/glusterfs-server, and got the same problem: I cannot restart the glusterfs processes from init, no matter which init file I try.
When I kill the processes by hand, it starts up, but as you can see there are no reports of any problems with starting NFS or of any blocked port. There is no firewalld running on my host, and the odd part is that I have two identical peered hosts: one is working and one is not.
dmesg is attached, as well as my "handmade" glusterfs systemd service, in case I am starting it wrongly.
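
Since the service reported "Stopped" while daemons were still alive, it may help to list what survived before restarting. A minimal sketch, assuming the daemons run under the usual GlusterFS binary names (glusterd, glusterfs, glusterfsd); `gluster_pids` is an illustrative helper name, not something shipped with Gluster:

```shell
# gluster_pids: read `ps -eo pid,args` output on stdin and print the PID of
# every process whose executable is glusterd, glusterfs or glusterfsd.
# (Hypothetical helper; the binary names are an assumption.)
gluster_pids() {
    awk 'NR > 1 && $2 ~ /(^|\/)gluster(d|fs|fsd)$/ { print $1 }'
}

# Usage on a live host, after stopping the service:
#   ps -eo pid,args | gluster_pids
```

An empty result after a stop would suggest the unit is cleaning up properly; any PID printed is a leftover that the init script did not terminate.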

2015-09-13 19:00 GMT+08:00 Soumya Koduri <skoduri@xxxxxxxxxx>:


On 09/13/2015 09:38 AM, Yaroslav Molochko wrote:
I wish this could be that simple:
root@PSC01SERV008:/var/lib# netstat -nap | grep 38465
root@PSC01SERV008:/var/lib# ss -n  | grep 38465
root@PSC01SERV008:/var/lib#
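
One caveat with the check above: `ss -n` without `-l` or `-a` omits listening sockets, so its empty result alone does not rule out a listener on 38465 (the `netstat -nap` run does cover them). A small sketch that looks specifically for TCP listeners, assuming the usual five-column `ss -ltn` layout; `port_listeners` is a made-up helper name:

```shell
# port_listeners PORT: read `ss -ltn` output on stdin and print the local
# address of every listener bound to PORT. (Hypothetical helper; assumes the
# columns are State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port.)
port_listeners() {
    awk -v port="$1" '
        # Skip the header; take column 4 and strip everything up to the last colon.
        NR > 1 { p = $4; sub(/.*:/, "", p); if (p == port) print $4 }
    '
}

# Usage on a live host:
#   ss -ltn | port_listeners 38465
```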

2015-09-13 1:34 GMT+08:00 Atin Mukherjee <atin.mukherjee83@xxxxxxxxx
<mailto:atin.mukherjee83@xxxxxxxxx>>:

    By any chance, is your Gluster NFS server already running? The output
    of netstat -nap | grep 38465 might give some clue.

    -Atin

    On Sep 12, 2015 10:54 PM, "Yaroslav Molochko" <onorua@xxxxxxxxx
    <mailto:onorua@xxxxxxxxx>> wrote:

        Hello,

        I have a problem reported in logs:
        ==================
        [2015-09-12 13:56:06.271644] I [MSGID: 100030]
        [glusterfsd.c:2301:main] 0-/usr/sbin/glusterfs: Started running
        /usr/sbin/glusterfs version 3.7.4 (args: /usr/sbin/glusterfs -s
        localhost --volfile-id gluster/nfs -p
        /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log
        -S /var/run/gluster/cb186678589f28e74c67da70fd06e736.socket)
        [2015-09-12 13:56:06.277921] I [MSGID: 101190]
        [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started
        thread with index 1
        [2015-09-12 13:56:07.284888] I
        [rpcsvc.c:2215:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service:
        Configured rpc.outstanding-rpc-limit with value 16
        [2015-09-12 13:56:07.292484] W [MSGID: 112153]
        [mount3.c:3910:mnt3svc_init] 0-nfs-mount: Exports auth has been
        disabled!
        [2015-09-12 13:56:07.294357] E
        [rpcsvc.c:1370:rpcsvc_program_register_portmap] 0-rpc-service:
        Could not register with portmap 100005 3 38465

Port registration failed. Could you check '/var/log/messages' and dmesg to see if there are any errors logged? Is firewalld running on your system? Please also verify that the port is open and available for use.

Thanks,
Soumya
        [2015-09-12 13:56:07.294398] E [MSGID: 112088]
        [nfs.c:341:nfs_init_versions] 0-nfs: Required program  MOUNT3
        registration failed
        [2015-09-12 13:56:07.294413] E [MSGID: 112109] [nfs.c:1482:init]
        0-nfs: Failed to initialize protocols
        [2015-09-12 13:56:07.294426] E [MSGID: 101019]
        [xlator.c:428:xlator_init] 0-nfs-server: Initialization of
        volume 'nfs-server' failed, review your volfile again
        [2015-09-12 13:56:07.294438] E
        [graph.c:322:glusterfs_graph_init] 0-nfs-server: initializing
        translator failed
        [2015-09-12 13:56:07.294448] E
        [graph.c:661:glusterfs_graph_activate] 0-graph: init failed
        [2015-09-12 13:56:07.294781] W
        [glusterfsd.c:1219:cleanup_and_exit]
        (-->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x11a) [0x7fbe9c754b7a]
        -->/usr/sbin/glusterfs(glusterfs_process_volfp+0x123)
        [0x7fbe9c74fcb3] -->/usr/sbin/glusterfs(cleanup_and_exit+0x59)
        [0x7fbe9c74f329] ) 0-: received signum (0), shutting down
        ===================
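
To narrow a log like the one above down to its failures, the severity letter that follows the bracketed timestamp can be used as a filter. A minimal sketch, assuming each entry occupies a single line in the on-disk log (the wrapping above looks like mail formatting); `gluster_errors` is an illustrative name, and the path in the usage line is the one passed via `-l` in the startup arguments:

```shell
# gluster_errors: read a GlusterFS log on stdin and print only the entries
# whose severity letter (right after the bracketed timestamp) is E.
# (Hypothetical helper; assumes one log entry per line.)
gluster_errors() {
    grep -E '^\[[0-9-]+ [0-9:.]+\] E '
}

# Usage:
#   gluster_errors < /var/log/glusterfs/nfs.log
```

The first error printed is usually the one worth chasing; here that would be the portmap registration failure, with the later entries following from it.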

        I've checked the page:
        http://www.gluster.org/community/documentation/index.php/Gluster_3.1:_NFS_Frequently_Asked_Questions

        I also found a Red Hat report saying that it is necessary to
        remove -w from the rpcbind options because it sometimes causes
        problems. I did all that, but still no luck on one of the
        servers. Interestingly, the other (peered) server is working
        fine without any problems.

        root@PSC01SERV008:/var/lib/glusterd/nfs# systemctl status nfs
        ● nfs.service
            Loaded: not-found (Reason: No such file or directory)
            Active: inactive (dead)

        root@PSC01SERV008:/var/lib/glusterd/nfs# systemctl status rpcbind
        ● rpcbind.service - RPC bind portmap service
            Loaded: loaded (/etc/systemd/system/rpcbind.service; enabled; vendor preset: enabled)
           Drop-In: /run/systemd/generator/rpcbind.service.d
                    └─50-rpcbind-$portmap.conf
            Active: active (running) since Sat 2015-09-12 13:55:07 UTC; 6min ago
          Main PID: 9796 (rpcbind)
            Memory: 428.0K
            CGroup: /system.slice/rpcbind.service
                    └─9796 /sbin/rpcbind

        Sep 12 13:55:07 PSC01SERV008 systemd[1]: Starting RPC bind portmap service...
        Sep 12 13:55:07 PSC01SERV008 systemd[1]: Started RPC bind portmap service.
        root@PSC01SERV008:/var/lib/glusterd/nfs# rpcinfo -p
            program vers proto   port  service
             100000    4   tcp    111  portmapper
             100000    3   tcp    111  portmapper
             100000    2   tcp    111  portmapper
             100000    4   udp    111  portmapper
             100000    3   udp    111  portmapper
             100000    2   udp    111  portmapper
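
The listing above contains only the portmapper itself (program 100000); the MOUNT3 program (100005, version 3) that failed to register in the log is absent. A small sketch for checking this from `rpcinfo -p` output; `rpc_registered` is an illustrative helper name, not part of rpcbind or Gluster:

```shell
# rpc_registered PROGRAM VERSION: read `rpcinfo -p` output on stdin and
# exit 0 if that program/version pair is listed, 1 otherwise.
# (Hypothetical helper.)
rpc_registered() {
    awk -v prog="$1" -v vers="$2" '
        $1 == prog && $2 == vers { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Usage on a live host:
#   rpcinfo -p | rpc_registered 100005 3 && echo "MOUNT3 registered" || echo "MOUNT3 missing"
```

Running the same check on the working peer gives a direct comparison between the two hosts.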

        I've tried reinstalling it again and again, but no luck.

        What I have:
        cat /etc/lsb-release
        DISTRIB_ID=Ubuntu
        DISTRIB_RELEASE=15.04
        DISTRIB_CODENAME=vivid
        DISTRIB_DESCRIPTION="Ubuntu 15.04"

        ii  glusterfs-client  3.7.4-ubuntu1~vivid1  amd64  clustered file-system (client package)
        ii  glusterfs-common  3.7.4-ubuntu1~vivid1  amd64  GlusterFS common libraries and translator modules
        ii  glusterfs-server  3.7.4-ubuntu1~vivid1  amd64  clustered file-system (server package)

        What else can I check? And, most importantly, how can I fix it? :)
        Thanks in advance!

        _______________________________________________
        Gluster-users mailing list
        Gluster-users@xxxxxxxxxxx <mailto:Gluster-users@xxxxxxxxxxx>
        http://www.gluster.org/mailman/listinfo/gluster-users



