So, I guess I figured it out. I had been looking for a volume problem based on the log messages, but it turned out to be a peer definition problem: one of the files in /var/lib/glusterd/peers was empty. I was able to determine where to look from the output of running /usr/sbin/glusterd --debug --pid-file=/var/run/glusterd.pid, and I was then able to copy the missing file from one of the other peers, since each peer keeps a file for each of the other 2 peers. (A short command-level recap is at the bottom of this message.)

On Mon, Oct 7, 2013 at 12:11 PM, Mark Morlino <mark at gina.alaska.edu> wrote:

> I'm hoping that someone here can point me in the right direction to help
> me solve a problem I am having.
>
> I've got 3 gluster peers and for some reason glusterd will not start on
> one of them. All are running glusterfs version 3.4.0-8.el6 on CentOS 6.4
> (2.6.32-358.el6.x86_64).
>
> In /var/log/glusterfs/etc-glusterfs-glusterd.vol.log I see this error
> repeated 36 times (alternating between brick-0 and brick-1):
>
>     E [glusterd-store.c:1845:glusterd_store_retrieve_volume] 0-: Unknown key: brick-0
>
> This makes some sense to me, since I have 18 replica 2 volumes, for a
> total of 36 bricks.
>
> Then there are a few more "I" messages, and this is the rest of the file:
>
>     E [glusterd-store.c:2472:glusterd_resolve_all_bricks] 0-glusterd: resolve brick failed in restore
>     E [xlator.c:390:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
>     E [graph.c:292:glusterfs_graph_init] 0-management: initializing translator failed
>     E [graph.c:479:glusterfs_graph_activate] 0-graph: init failed
>     W [glusterfsd.c:1002:cleanup_and_exit] (-->/usr/sbin/glusterd(main+0x5d2) [0x406802] (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xb7) [0x4051b7] (-->/usr/sbin/glusterd(glusterfs_process_volfp+0x103) [0x4050c3]))) 0-: received signum (0), shutting down
>
> Here are the contents of /etc/glusterfs/glusterd.vol:
>
>     volume management
>         type mgmt/glusterd
>         option working-directory /var/lib/glusterd
>         option transport-type socket,rdma
>         option transport.socket.keepalive-time 10
>         option transport.socket.keepalive-interval 2
>         option transport.socket.read-fail-log off
>     end-volume
>
> glusterd.vol is the same on all of the peers, and the other peers work.
>
> Any help on where to look next would be greatly appreciated.
>
> Thanks,
> Mark
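
P.S. For anyone who finds this thread later, here is a rough recap of the recovery as shell commands. The hostname "peer2" and the UUID in the filename are placeholders; the real files in /var/lib/glusterd/peers are named after each peer's UUID, so check which file is empty on the broken node first:

    # Run glusterd in the foreground with debug logging to see where startup fails
    /usr/sbin/glusterd --debug --pid-file=/var/run/glusterd.pid

    # Look for zero-length peer definition files
    find /var/lib/glusterd/peers -type f -empty

    # Pull a good copy from one of the healthy peers -- each peer keeps a
    # file for every other peer, so another node has the same file intact
    scp peer2:/var/lib/glusterd/peers/PEER-UUID /var/lib/glusterd/peers/

    # Start glusterd normally and confirm the cluster is healthy again
    service glusterd start
    gluster peer status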
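
For reference, a healthy peer file is just a small key=value listing along these lines (the UUID and hostname below are made up, and the exact keys may vary between glusterfs versions):

    uuid=8e979f2e-1b2c-4d5e-8f90-123456789abc
    state=3
    hostname1=peer2.example.com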