Re: Cannot start Gluster -- resolve brick failed in restore

On 06/08/2015 01:38 PM, shacky wrote:
> Hi.
> I have a GlusterFS cluster running on Debian Wheezy with GlusterFS
> 3.6.2, with one volume across three bricks (web1, web2, web3).
> Everything was working well until I changed the IP addresses of the
> bricks; since then only the GlusterFS daemon on web1 starts correctly,
> and the daemons on web2 and web3 exit with these errors:
> 
> [2015-06-08 07:59:15.929330] I [MSGID: 100030]
> [glusterfsd.c:2018:main] 0-/usr/sbin/glusterd: Started running
> /usr/sbin/glusterd version 3.6.2 (args: /usr/sbin/glusterd -p
> /var/run/glusterd.pid)
> [2015-06-08 07:59:15.932417] I [glusterd.c:1214:init] 0-management:
> Maximum allowed open file descriptors set to 65536
> [2015-06-08 07:59:15.932482] I [glusterd.c:1259:init] 0-management:
> Using /var/lib/glusterd as working directory
> [2015-06-08 07:59:15.933772] W [rdma.c:4221:__gf_rdma_ctx_create]
> 0-rpc-transport/rdma: rdma_cm event channel creation failed (No such
> device)
> [2015-06-08 07:59:15.933815] E [rdma.c:4519:init] 0-rdma.management:
> Failed to initialize IB Device
> [2015-06-08 07:59:15.933838] E
> [rpc-transport.c:333:rpc_transport_load] 0-rpc-transport: 'rdma'
> initialization failed
> [2015-06-08 07:59:15.933887] W [rpcsvc.c:1524:rpcsvc_transport_create]
> 0-rpc-service: cannot create listener, initing the transport failed
> [2015-06-08 07:59:17.354500] I
> [glusterd-store.c:2043:glusterd_restore_op_version] 0-glusterd:
> retrieved op-version: 30600
> [2015-06-08 07:59:17.527377] I
> [glusterd-handler.c:3146:glusterd_friend_add_from_peerinfo]
> 0-management: connect returned 0
> [2015-06-08 07:59:17.527446] I
> [glusterd-handler.c:3146:glusterd_friend_add_from_peerinfo]
> 0-management: connect returned 0
> [2015-06-08 07:59:17.527499] I
> [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting
> frame-timeout to 600
> [2015-06-08 07:59:17.528139] I
> [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting
> frame-timeout to 600
> [2015-06-08 07:59:17.528861] E
> [glusterd-store.c:4244:glusterd_resolve_all_bricks] 0-glusterd:
> resolve brick failed in restore
> [2015-06-08 07:59:17.528891] E [xlator.c:425:xlator_init]
> 0-management: Initialization of volume 'management' failed, review
> your volfile again
> [2015-06-08 07:59:17.528906] E [graph.c:322:glusterfs_graph_init]
> 0-management: initializing translator failed
> [2015-06-08 07:59:17.528917] E [graph.c:525:glusterfs_graph_activate]
> 0-graph: init failed
> [2015-06-08 07:59:17.529257] W [glusterfsd.c:1194:cleanup_and_exit]
> (--> 0-: received signum (0), shutting down
> 
> Please note that the brick names are set in /etc/hosts and all of them
> resolve correctly to the new IP addresses, so I cannot figure out
> where the problem is.
> 
> Could you help me please?

Here is what you can do on the nodes where glusterd fails to start:

1. cd /var/lib/glusterd
2. grep -irns "<old ip>"
The output will look similar to this:

vols/test-vol/info:20:brick-0=172.17.0.2:-tmp-b1
vols/test-vol/info:21:brick-1=172.17.0.2:-tmp-b2
vols/test-vol/test-vol.tcp-fuse.vol:6:    option remote-host 172.17.0.2
vols/test-vol/test-vol.tcp-fuse.vol:15:    option remote-host 172.17.0.2
vols/test-vol/trusted-test-vol.tcp-fuse.vol:8:    option remote-host 172.17.0.2
vols/test-vol/trusted-test-vol.tcp-fuse.vol:19:    option remote-host 172.17.0.2
vols/test-vol/test-vol-rebalance.vol:6:    option remote-host 172.17.0.2
vols/test-vol/test-vol-rebalance.vol:15:    option remote-host 172.17.0.2
vols/test-vol/bricks/172.17.0.1:-tmp-b1:1:hostname=172.17.0.2
vols/test-vol/bricks/172.17.0.1:-tmp-b2:1:hostname=172.17.0.2
nfs/nfs-server.vol:8:    option remote-host 172.17.0.2
nfs/nfs-server.vol:19:    option remote-host 172.17.0.2

3. find . -type f -exec sed -i "s/<old ip>/<new ip>/g" {} \;

4. You will also need to manually rename the few files that carry the
old IP in their name (e.g. mv vols/test-vol/bricks/172.17.0.1:-tmp-b1
vols/test-vol/bricks/172.17.0.2:-tmp-b1); see the sketch below.
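
If you would rather script steps 3 and 4 in one go, here is a minimal
sketch. The OLD_IP/NEW_IP values below are placeholders (substitute your
own addresses), and it assumes glusterd is stopped on the node while you
edit:

#!/bin/bash
# Rewrite an old brick IP under /var/lib/glusterd after an address change.
# OLD_IP and NEW_IP are placeholders -- replace them with your real ones.
OLD_IP="172.17.0.2"
NEW_IP="172.17.0.3"

cd /var/lib/glusterd || exit 1

# Escape the dots so sed matches them literally instead of as wildcards.
OLD_RE=${OLD_IP//./\\.}

# Step 3: rewrite the old IP inside every file in the working directory.
find . -type f -exec sed -i "s/$OLD_RE/$NEW_IP/g" {} \;

# Step 4: rename the per-brick store files that embed the IP in their name.
for f in vols/*/bricks/"$OLD_IP":*; do
    [ -e "$f" ] && mv "$f" "${f/$OLD_IP/$NEW_IP}"
done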

Do this on all of the failed nodes, then restart glusterd and let me
know if it works.
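
Once glusterd starts again, something like the following should confirm
that the peers and volume are healthy (on Debian Wheezy the init script
is glusterfs-server; on RPM-based distros it is glusterd):

service glusterfs-server restart
gluster peer status
gluster volume info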

> 
> Thank you very much!
> Bye

-- 
~Atin
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users



