On 11/03/2016 2:24 AM, David Gossage wrote:
> It is file-based, not block-based, healing, so it saw multi-GB files
> that it had to recopy. It had to halt all writes to those files while
> that occurred or it would be a never-ending cycle of re-copying the
> large images, so the fact that most VMs went haywire isn't that odd.
> Based on the timing of the alerts, it does look like the 2 bricks that
> were up kept serving images until the 3rd brick came back. It did heal
> all images just fine.
What version are you running? 3.7.x has sharding (it breaks large files
into chunks) to allow much finer-grained healing, which speeds up heals
a *lot*. However, it can't be applied retroactively; you have to enable
sharding and then copy the VM over :(
http://blog.gluster.org/2015/12/introducing-shard-translator/
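If it helps, here is a rough sketch of what enabling sharding looks
like. The volume name "gv0", the mount path and the 64MB shard size are
just examples, so check the defaults for your version first:

    # turn on the shard translator for an existing volume (only affects
    # files written after this point, existing images stay whole)
    gluster volume set gv0 features.shard on
    # optionally pick a shard size, 64MB is a commonly used value
    gluster volume set gv0 features.shard-block-size 64MB

    # with the VM shut down, copy each image so it gets rewritten as
    # shards, then swap it into place
    cp /mnt/gv0/images/vm1.qcow2 /mnt/gv0/images/vm1.qcow2.sharded
    mv /mnt/gv0/images/vm1.qcow2.sharded /mnt/gv0/images/vm1.qcow2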
Regarding rolling reboots: it can be done with replicated storage, and
gluster will transparently hand client reads/writes over to the
remaining bricks, but for each VM image only one copy at a time can be
healing, otherwise access will be blocked as you saw.
So the recommended procedure:
- Enable sharding
- Copy the VMs over
- When rebooting, wait for heals to complete before rebooting the next
  node (see the sketch below)
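For the "wait for heals" step, something like this works (again, "gv0"
is just a placeholder for your volume name):

    # list the files/shards each brick still needs to heal
    gluster volume heal gv0 info

    # or just watch the entry counts drop to zero before touching
    # the next node
    watch -n 60 'gluster volume heal gv0 info | grep "Number of entries"'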
nb: I thoroughly recommend 3-way replication as you have done; it saves
a lot of headaches with quorum and split-brain.
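With replica 3 you can also let gluster enforce quorum for you. These
are the standard volume options for that (just a sketch, check the docs
for your release before applying):

    # refuse writes unless a majority of bricks in the replica set are up
    gluster volume set gv0 cluster.quorum-type auto

    # take bricks offline if a node loses quorum with the rest of the pool
    gluster volume set gv0 cluster.server-quorum-type server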
--
Lindsay Mathieson
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users