Problem replicating an existing Gluster volume from a single-brick to a two-brick setup


 



Greetings,

 

I have a problem replicating an existing Gluster volume from a single-brick setup to a two-brick setup. The background of the problem is as follows:

 

OS: Ubuntu 14.04

Gluster version (from gluster repos): glusterfs 3.7.14 built on Aug  1 2016 16:57:28

 

1. I had a replicated setup consisting of two Gluster nodes (srv100, srv102) and two volumes (gv0, gv100).

2. I had to completely rebuild the RAID/disks of one of the nodes (srv100) due to a hardware failure. I did so as follows:

2.1 Removed the failed brick from the replication setup (reduced the replica count from 2 to 1, and detached the node). I executed the following commands on the *good* node:

    sudo gluster volume remove-brick gv100 replica 1 srv100:/pool01/gfs/brick1/gv100 force
    sudo gluster volume remove-brick gv0 replica 1 srv100:/pool01/gfs/brick1/gv0 force
    sudo gluster vol info   # make sure the faulty node's bricks are no longer listed, and brick count is 1 for each volume
    sudo gluster peer detach srv100 force
    sudo gluster peer status   # --> OK, only one node/brick

 

2.2 Stopped glusterd and killed all remaining gluster processes on the faulty node.

2.3 Replaced the HDs and recreated the RAID. This means all GlusterFS data directories were lost on the faulty node (srv100), while the GlusterFS installation and config files were untouched (including hostname and IP address).

2.4 After rebuilding, I created the volume directories on the rebuilt node.
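Step 2.4 can be sketched roughly as below. This is a hedged illustration, not the exact commands used: the volume names and brick layout are taken from the remove-brick commands above, and BRICK_ROOT defaults to a scratch path here so the sketch is safe to run anywhere (on srv100 it would be /pool01/gfs/brick1).

```shell
# Illustrative sketch of step 2.4: recreating empty brick directories
# for the existing volumes on the rebuilt node.
# BRICK_ROOT defaults to a scratch path; on srv100 it is /pool01/gfs/brick1.
BRICK_ROOT="${BRICK_ROOT:-/tmp/pool01/gfs/brick1}"
for vol in gv0 gv100; do
    mkdir -p "$BRICK_ROOT/$vol"
done
ls -1 "$BRICK_ROOT"
```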

2.5 Then I started the gluster service and added the node back to the cluster. Peer status is OK (in cluster).
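Step 2.5 presumably amounted to something like the following sketch. Hedged: the original does not show the exact commands; `peer probe` has to be run from a node already in the cluster (i.e. srv102), and the block is guarded so it is a no-op on machines without the gluster CLI.

```shell
# Hedged sketch of step 2.5: re-attaching the rebuilt node (srv100).
# Run from the good node (srv102), which is already in the cluster.
if command -v gluster >/dev/null 2>&1; then
    sudo gluster peer probe srv100    # add the rebuilt node back
    sudo gluster peer status          # expect: State: Peer in Cluster (Connected)
    probe_step="attempted"
else
    probe_step="skipped"              # gluster CLI not installed on this machine
fi
echo "peer probe $probe_step"
```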

 

2.6 Then I attempted to re-replicate one of the existing volumes (gv0), and *there* the problem appeared. The replication could not be set up, and the command gave the following error:

    sudo gluster volume add-brick gv0 replica 2 srv100:/pool01/gfs/brick1/gv0
        volume add-brick: failed: Staging failed on srv100. Please check log file for details.

 

The relevant glusterd log file says:

 

[2016-08-25 12:32:29.499708] I [MSGID: 106499] [glusterd-handler.c:4267:__glusterd_handle_status_volume] 0-management: Received status volume req for volume gv-temp

[2016-08-25 12:32:29.501881] E [MSGID: 106301] [glusterd-syncop.c:1274:gd_stage_op_phase] 0-management: Staging of operation 'Volume Status' failed on localhost : Volume gv-temp is not started

[2016-08-25 12:32:29.505033] I [MSGID: 106499] [glusterd-handler.c:4267:__glusterd_handle_status_volume] 0-management: Received status volume req for volume gv0

[2016-08-25 12:32:29.508585] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on srv100. Error: Volume gv0 does not exist

[2016-08-25 12:32:29.511062] I [MSGID: 106499] [glusterd-handler.c:4267:__glusterd_handle_status_volume] 0-management: Received status volume req for volume gv100

[2016-08-25 12:32:29.514556] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on srv100. Error: Volume gv100 does not exist

[2016-08-25 12:33:15.865773] I [MSGID: 106499] [glusterd-handler.c:4267:__glusterd_handle_status_volume] 0-management: Received status volume req for volume gv0

[2016-08-25 12:33:15.869441] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on srv100. Error: Volume gv0 does not exist

[2016-08-25 12:33:15.872630] I [MSGID: 106499] [glusterd-handler.c:4267:__glusterd_handle_status_volume] 0-management: Received status volume req for volume gv100

[2016-08-25 12:33:15.876199] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on srv100. Error: Volume gv100 does not exist

[2016-08-25 12:34:14.716735] I [MSGID: 106482] [glusterd-brick-ops.c:442:__glusterd_handle_add_brick] 0-management: Received add brick req

[2016-08-25 12:34:14.716787] I [MSGID: 106062] [glusterd-brick-ops.c:494:__glusterd_handle_add_brick] 0-management: replica-count is 2

[2016-08-25 12:34:14.716809] I [MSGID: 106447] [glusterd-brick-ops.c:240:gd_addbr_validate_replica_count] 0-management: Changing the type of volume gv0 from 'distribute' to 'replica'

[2016-08-25 12:34:14.720133] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on srv100. Please check log file for details.

 

3. I tried creating a new replicated volume (gv-temp) across the nodes --> it is created and replicated fine. It is only the existing volumes that I cannot replicate again!

4. I also observed that the /var/lib/glusterd/vols directory on the rebuilt node contains a directory for the newly created volume (gv-temp) only, and none for the existing volumes (gv100, gv0).
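The observation in point 4 can be reproduced with a quick listing such as the sketch below. Hedged: /var/lib/glusterd/vols is glusterd's standard per-volume metadata location as stated above; the block degrades gracefully on hosts where glusterd is not installed.

```shell
# Sketch for point 4: listing glusterd's per-volume metadata directories.
# On the rebuilt node this listing showed only gv-temp, not gv0 or gv100.
VOLS_DIR="${VOLS_DIR:-/var/lib/glusterd/vols}"
if [ -d "$VOLS_DIR" ]; then
    ls -1 "$VOLS_DIR"
else
    echo "no $VOLS_DIR on this host"
fi
```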

 

 

*Questions:* 

a. How can I re-replicate the existing volumes, for which I reduced the replica count to 1 (see point 2.1)?

b. Is there a "glusterfs" way to create the missing volume directories (under /var/lib/glusterd/vols) on the rebuilt node (see point 4)?

c. Any other pointers, hints?

 

Thanks.

 

Kind regards,

JAsghar

 

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
