The replace-brick commit force command can be used. If you are on glusterfs 3.7.3
or above, self-heal will be triggered automatically from the good bricks to the
newly added brick. But you can't replace a brick on the same path as before; your
new brick path will have to be different from the existing ones in the volume.
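For example, with placeholder names (a volume called testvol, server2:/bricks/old
as the failed brick path and server2:/bricks/new as the new path on the same node;
substitute your own), it would look roughly like this:

# put a new brick on a different path in place of the failed one
gluster volume replace-brick testvol server2:/bricks/old server2:/bricks/new commit force
# on 3.7.3 and above, self-heal from the good bricks to the new brick then starts on its own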
----- Original Message -----
> From: "Mahdi Adnan" <mahdi.adnan@xxxxxxxxxxx>
> To: "Andres E. Moya" <amoya@xxxxxxxxxxxxxxxxx>, "gluster-users" <gluster-users@xxxxxxxxxxx>
> Sent: Thursday, August 4, 2016 1:25:59 AM
> Subject: Re: Failed file system
>
> Hi,
>
> I'm not an expert in Gluster, but I think it would be better to replace the
> downed brick with a new one.
> Maybe start from here:
>
> https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Managing%20Volumes/#replace-brick
>
> --
> Respectfully
> Mahdi A. Mahdi
>
> Date: Wed, 3 Aug 2016 15:39:35 -0400
> From: amoya@xxxxxxxxxxxxxxxxx
> To: gluster-users@xxxxxxxxxxx
> Subject: Re: Failed file system
>
> Does anyone else have input?
>
> We are currently running off only one node; the other node of the replicated
> brick is offline.
>
> We are not experiencing any downtime because the one node is up.
>
> I do not understand which is the best way to bring up a second node.
>
> Do we just re-create a file system on the node that is down, plus the mount
> points, and allow Gluster to heal? (My concern with this is whether the node
> that is down will somehow take precedence and wipe out the data on the
> healthy node instead of vice versa.)
>
> Or do we fully wipe out the config on the node that is down, re-create the
> file system, re-add the down node into Gluster using the add-brick command
> with replica 3, wait for it to heal, and then run the remove-brick command
> for the failed brick?
>
> Which would be the safest and easiest to accomplish?
>
> Thanks for any input
>
> From: "Leno Vo" <lenovolastname@xxxxxxxxx>
> To: "Andres E. Moya" <amoya@xxxxxxxxxxxxxxxxx>
> Cc: "gluster-users" <gluster-users@xxxxxxxxxxx>
> Sent: Tuesday, August 2, 2016 6:45:27 PM
> Subject: Re: Failed file system
>
> If you don't want any downtime (in the case that your node 2 really has died),
> you have to create a new Gluster SAN (if you have the resources, of course;
> use 3 nodes as much as possible this time), and then just migrate your VMs
> (or files). That way there is no downtime, but you have to cross your fingers
> that the only remaining node will not die too... Also, without sharding, the
> VM migration (especially an RDP one) will mean slow access for users until it
> has migrated.
>
> You have to start testing sharding; it's fast and cool...
>
> On Tuesday, August 2, 2016 2:51 PM, Andres E. Moya <amoya@xxxxxxxxxxxxxxxxx> wrote:
>
> Couldn't we just add a new server by:
>
> gluster peer probe
> gluster volume add-brick replica 3 (will this command succeed with 1 current
> failed brick?)
>
> let it heal, then
>
> gluster volume remove-brick
>
> From: "Leno Vo" <lenovolastname@xxxxxxxxx>
> To: "Andres E. Moya" <amoya@xxxxxxxxxxxxxxxxx>, "gluster-users" <gluster-users@xxxxxxxxxxx>
> Sent: Tuesday, August 2, 2016 1:26:42 PM
> Subject: Re: Failed file system
>
> You need to have downtime to recreate the second node. Two nodes is actually
> not good for production, and you should have put RAID 1 or RAID 5 under your
> Gluster storage. When you recreate the second node you might try running some
> VMs that need to be up and keep the rest of the VMs down, but stop all
> backups, and if you have replication, stop it too.
>
> If you have a 1G NIC, 2 CPUs and less than 8G of RAM, then I suggest turning
> off all of the VMs during recreation of the second node. Someone said that if
> you have sharding with 3.7.x, maybe some VIP VMs can stay up...
>
> If it is just a filesystem, then just turn off the backup service until you
> recreate the second node. Depending on your resources and how big your
> storage is, it might take hours or even days to recreate it...
>
> Here's my process for recreating the second or third node (copied and
> modified from the net):
>
> # Make sure the partition is already added!
> This procedure is for replacing a failed server, IF your newly installed
> server has the same hostname as the failed one:
>
> (If your new server will have a different hostname, see this article instead.)
>
> For the purposes of this example, the server that crashed will be server3 and
> the other servers will be server1 and server2.
>
> On both server1 and server2, make sure hostname server3 resolves to the
> correct IP address of the new replacement server.
>
> # On either server1 or server2, do
> grep server3 /var/lib/glusterd/peers/*
>
> This will return a UUID followed by ":hostname1=server3"
>
> # On server3, make sure glusterd is stopped, then do
> echo UUID={uuid from previous step}>/var/lib/glusterd/glusterd.info
>
> # Actual testing below:
> [root@node1 ~]# cat /var/lib/glusterd/glusterd.info
> UUID=4b9d153c-5958-4dbe-8f91-7b5002882aac
> operating-version=30710
> # The second line is new... maybe not needed...
>
> On server3:
> make sure that all brick directories are created/mounted
> start glusterd
> peer probe one of the existing servers
>
> # Restart glusterd, then check that the full peer list has been populated using
> gluster peer status
>
> (If peers are missing, probe them explicitly, then restart glusterd again.)
>
> # Check that the full volume configuration has been populated using
> gluster volume info
>
> If the volume configuration is missing, do:
>
> # on the other node
> gluster volume sync "replace-node" all
>
> # on the node to be replaced
> setfattr -n trusted.glusterfs.volume-id -v 0x$(grep volume-id /var/lib/glusterd/vols/v1/info | cut -d= -f2 | sed 's/-//g') /gfs/b1/v1
> setfattr -n trusted.glusterfs.volume-id -v 0x$(grep volume-id /var/lib/glusterd/vols/v2/info | cut -d= -f2 | sed 's/-//g') /gfs/b2/v2
> setfattr -n trusted.glusterfs.volume-id -v 0x$(grep volume-id /var/lib/glusterd/vols/config/info | cut -d= -f2 | sed 's/-//g') /gfs/b1/config/c1
>
> mount -t glusterfs localhost:config /data/data1
>
> # Install ctdb if not yet installed and put it back online; use the steps for
> # creating the ctdb config, but use your common sense not to delete or modify
> # the current one.
>
> gluster vol heal v1 full
> gluster vol heal v2 full
> gluster vol heal config full
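Once the full heals above have been kicked off, progress can be checked per
volume with something like the following (v1, v2 and config are just the volume
names used in the quoted steps; substitute your own):

# list the entries still pending heal on each brick
gluster volume heal v1 info
# or just get a per-brick count of pending heals
gluster volume heal v1 statistics heal-count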
>
> On Tuesday, August 2, 2016 11:57 AM, Andres E. Moya <amoya@xxxxxxxxxxxxxxxxx> wrote:
>
> Hi, we have a 2-node replica setup.
> On 1 of the nodes the file system that had the brick on it failed, not the OS.
> Can we re-create a file system and mount the bricks on the same mount point?
>
> What will happen: will the data from the other node sync over, or will the
> failed node wipe out the data on the other node?
>
> What would be the correct process?
>
> Thanks in advance for any help

--
Thanks,
Anuradha.

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users