Xavi, does that mean that even if every node were rebooted one at a time, without issuing a heal in between, the volume would have no issues after running "gluster volume heal <volname>" once all bricks are back online?
From: Xavi Hernandez <jahernan@xxxxxxxxxx>
Sent: Thursday, March 15, 2018 12:09:05 AM
To: Victor T
Cc: gluster-users@xxxxxxxxxxx
Subject: Re: Disperse volume recovery and healing

Hi Victor,
On Wed, Mar 14, 2018 at 12:30 AM, Victor T <hero_of_nothing_1@xxxxxxxxxxx> wrote:
On a 4+2 configuration you can bring down up to 2 bricks simultaneously for maintenance. However, if something happens to one of the remaining 4 bricks while they are down, the volume stops working. So in this case I would recommend not having more than one server down for maintenance at the same time, unless the downtime is very short.
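For reference, a 4+2 dispersed volume like the one discussed here could be created along these lines (the volume name, hostnames and brick paths below are hypothetical, just to illustrate the 6-brick / redundancy-2 layout):

    # 6 bricks total: 4 data + 2 redundancy, so at most 2 bricks may be offline at once
    gluster volume create dispvol disperse 6 redundancy 2 \
        server{1..6}:/data/brick/dispvol
    gluster volume start dispvol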
Once the stopped servers come back up again, you need to wait until all files are healed before proceeding with the next server. Failing to do so means that some files could have more than 2 non-healthy versions, which will make those files inaccessible until enough healthy versions are available again.
Self-heal should be triggered automatically once the bricks come back online; however, there was a bug (https://bugzilla.redhat.com/show_bug.cgi?id=1547662) that could cause delays in the self-heal process. This bug should be fixed in the next version. In the meantime, you can force self-heal to progress by issuing "gluster volume heal <volname>" each time it seems to have stopped.
Once the output of "gluster volume heal <volname> info" reports 0 pending files on all bricks, you can proceed with the maintenance of the next server.
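As a rough example of that workflow (assuming the hypothetical volume name "dispvol"), the heal can be forced and monitored from any server in the pool:

    # force/resume the self-heal if it appears to have stopped
    gluster volume heal dispvol

    # check progress; proceed only when every brick shows "Number of entries: 0"
    gluster volume heal dispvol info

    # optionally poll it, e.g. once per minute
    watch -n 60 'gluster volume heal dispvol info'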
There is no need to do any rebalance for down bricks. Rebalance is basically only needed when the volume is expanded with more bricks.
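For completeness, that expansion case is where rebalance comes in; a sketch (again with hypothetical hosts and paths, and remembering that a disperse volume grows in whole 6-brick subvolumes) would be:

    # add another complete 4+2 subvolume
    gluster volume add-brick dispvol server{7..12}:/data/brick/dispvol
    # spread existing data across the new bricks
    gluster volume rebalance dispvol start
    gluster volume rebalance dispvol status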
Xavi
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users