Re: Recovering from Arb/Quorum Write Locks

On 5/28/2017 9:24 PM, Ravishankar N wrote:
Just to elaborate further: if all nodes were up to begin with and there were zero self-heals pending, and you brought down only gluster2, writes must still be allowed. I guess in your case there must have been some pending heals from gluster2 to gluster1 before you brought gluster2 down, due to a network disconnect from the fuse mount to gluster1.


OK, I was aggressively writing within and to those VMs while at the same time pulling cables (power and network). My initial observation was that the shards healed quickly, but perhaps I got too aggressive and didn't wait long enough between tests for the healing to kick in and/or finish.

I will retest and pay attention to outstanding heals, both prior and during the tests.
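One way to watch outstanding heals before and during each test is to parse the output of `gluster volume heal <VOLNAME> info`. The following is only a minimal sketch: it assumes the usual output layout of a `Brick ...` line followed by a `Number of entries: N` line per brick (with `-` in place of a number when a brick is unreachable). The function names and the volume name passed to `heal_info` are hypothetical.

```python
import subprocess


def parse_heal_info(text):
    """Map each brick to its pending heal count.

    Assumes output shaped like `gluster volume heal <VOLNAME> info`.
    A count of None means the brick did not report a number
    (for example, it was not connected).
    """
    pending = {}
    brick = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
        elif line.startswith("Number of entries:") and brick is not None:
            value = line.split(":", 1)[1].strip()
            pending[brick] = int(value) if value.isdigit() else None
            brick = None
    return pending


def safe_to_take_down(pending):
    """True only if every brick reported exactly zero pending entries."""
    return bool(pending) and all(count == 0 for count in pending.values())


def heal_info(volume):
    """Run the gluster CLI for `volume` (hypothetical name) and parse it."""
    result = subprocess.run(
        ["gluster", "volume", "heal", volume, "info"],
        check=True, capture_output=True, text=True,
    )
    return parse_heal_info(result.stdout)
```

Calling `safe_to_take_down(heal_info("myvol"))` before pulling a cable would then show whether any brick still had entries waiting to heal. An unreachable brick is treated as "not safe" on purpose, since its heal state is unknown.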

I suppose I could fiddle with the quorum settings as above, but I'd like to be able to PAUSE/FLUSH/FSYNC the Volume before taking down Gluster2, then unpause and let the volume continue with Gluster1 and the ARB providing some sort of protection and to help when Gluster2 is returned to the cluster.


I think you should try to find out whether there were self-heals pending to gluster1 before you brought gluster2 down; otherwise the VMs should not have paused.

Yes, I'll start looking at heals PRIOR to yanking cables.

OK, can I assume that SOME pause is expected when Gluster first sees gluster2 go down, and that the VMs unpause after a timeout period? I have seen that behaviour as well.
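To put a number on that pause, a small probe can time synchronous writes to a file on the mounted volume while a node is taken down. This is a rough sketch, not a Gluster tool: the function names, the file path, and the 5-second stall threshold are all assumptions chosen for illustration. The length of the pause is presumably related to the volume's `network.ping-timeout` setting, but that is an assumption to verify against your own configuration.

```python
import os
import time


def timed_write(path, payload=b"x" * 4096):
    """Append payload to path, fsync it, and return the elapsed seconds."""
    start = time.monotonic()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, payload)
        os.fsync(fd)  # force the write through to the bricks
    finally:
        os.close(fd)
    return time.monotonic() - start


def probe(path, count, interval=1.0, stall_threshold=5.0):
    """Perform `count` timed writes and return the ones that stalled.

    Each result is (write_index, elapsed_seconds) for writes that took
    at least `stall_threshold` seconds.
    """
    stalls = []
    for i in range(count):
        elapsed = timed_write(path)
        if elapsed >= stall_threshold:
            stalls.append((i, elapsed))
        time.sleep(interval)
    return stalls
```

Running `probe("/mnt/glustervol/probe.dat", 120)` (a hypothetical mount path) during a cable pull would list any writes that blocked, along with how long each one took.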

-bill



