Re: 3.3.0 -> 3.4.2 / Rolling upgrades with no downtime

If you're not sure, just pick one and save the other.

Steps I took:
Save one of the split-brained qcow2 copies (copy it from the brick to another name).
Remove the qcow2 file on the brick that you just "backed up". Gluster will heal from the other copy.
Restart the VM.
If the VM recovers after an fsck, just delete the saved qcow2 file;
    otherwise try the other copy.
If both are messed up, fall back on other backups or rebuild the VM.
    Stuff happens.
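
The steps above could be scripted roughly as follows. This is a minimal sketch under stated assumptions, not a tested recovery tool: the names set_aside_copy and gfid_link_path and every path are hypothetical, the script would have to run on the brick server against the brick directory (not the client mount), and the VM is assumed to be shut down. As far as I know, on GlusterFS 3.3 and later each file on a brick also has a hard link under .glusterfs/ named after its gfid, and that link normally has to be removed together with the file (the write-up by Joe Julian linked further down the thread covers this). The helper only computes where that link would be expected; it does not read the gfid or delete anything.

```python
import os
import shutil
import uuid


def gfid_link_path(brick_root, gfid):
    """Expected location of the .glusterfs hard link for a gfid.

    Assumed layout: .glusterfs/<first 2 hex chars>/<next 2 hex chars>/<gfid>
    """
    g = str(uuid.UUID(gfid))  # validates and lower-cases the gfid string
    return os.path.join(brick_root, ".glusterfs", g[0:2], g[2:4], g)


def set_aside_copy(brick_file, backup_path):
    """Copy one split-brained copy aside, then remove it from the brick."""
    if os.path.exists(backup_path):
        # Refuse to overwrite an earlier backup.
        raise FileExistsError(backup_path)
    shutil.copy2(brick_file, backup_path)
    if os.path.getsize(backup_path) != os.path.getsize(brick_file):
        raise OSError("backup size differs from original, not removing")
    os.remove(brick_file)
    return backup_path
```

Keeping the backup outside the brick directory seems safer, so that Gluster never sees it as a new file.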



On Thu, Mar 6, 2014 at 1:50 AM, João Pagaime <joao.pagaime@xxxxxxxxx> wrote:
thanks

But which qcow2/FVM file should I choose for deletion? Maybe there is a known current best practice for maximum VM stability.

If the VM is frozen, the decision may be to delete the oldest qcow2/FVM file, or to choose at random if there is no difference.
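
To make the "oldest file" idea above concrete, here is a small sketch that collects size, modification time and a checksum for one copy, so the two brick copies can be compared side by side. It is only an illustration: describe_copy and pick_older are hypothetical names, the function would be run once on each brick server, and an older mtime is not known to indicate the bad copy. It is just the heuristic raised in this message.

```python
import hashlib
import os


def describe_copy(path, chunk_size=1 << 20):
    """Return size, mtime and SHA-256 digest for one copy of a file."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large qcow2 images do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    st = os.stat(path)
    return {"path": path, "size": st.st_size, "mtime": st.st_mtime,
            "sha256": digest.hexdigest()}


def pick_older(desc_a, desc_b):
    """Return the description with the older mtime, or None on a tie."""
    if desc_a["mtime"] == desc_b["mtime"]:
        return None
    return desc_a if desc_a["mtime"] < desc_b["mtime"] else desc_b
```

If the two digests match, the data is identical and only the metadata disagrees, in which case the choice of copy should not matter for the VM.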

best regards,
--joão




On 05-03-2014 20:26, Bryan Whitehead wrote:
(This can take too long if you have many files, because it uses "find".)

Right afterwards, Joe Julian released a pretty handy system for exploring split-brains and fixing them. You can check it out here:
http://joejulian.name/blog/glusterfs-split-brain-recovery-made-easy/
https://github.com/joejulian/glusterfs-splitbrain



On Wed, Mar 5, 2014 at 1:40 AM, João Pagaime <joao.pagaime@xxxxxxxxx> wrote:
Hello Bryan, and thanks for sharing!

How did you fix those 2 files in the split-brain situation? Did you delete one "bad" file?
Which one did you select for deletion?

In a software update situation I would not expect peer probe problems, simply because no "gluster peer probe" commands are involved. The problem that could happen is the updated 3.4.2 node having trouble re-entering a 3.3.0 cluster (without "peer probe" commands). It's good news that it went well.

best regards
joao

On 04-03-2014 18:30, Bryan Whitehead wrote:
I just did this last week from 3.3.0->3.4.2.

I never got the peer probe problems - but I did end up with 2 files being in a split-brain situation.

Note: I only had about a hundred files, qcow2 images for KVM, so 2 files in split-brain means roughly 2% of the filesystem was affected.


On Tue, Mar 4, 2014 at 1:43 AM, João Pagaime <joao.pagaime@xxxxxxxxx> wrote:
Hello all

Has anyone tried a rolling upgrade with no downtime [1] from 3.3.0 to 3.4.2, or a similar upgrade? Any comments?

For testing purposes we've installed a 3.4.2 server and it won't peer, giving the error "peer probe: failed: Peer X does not support required op-version".

I guess this is expected behavior for a new node joining the cluster.

What about changing the software on an existing peer of the cluster? Will it also refuse to re-enter the cluster after the upgrade for the same reason (peers not supporting the required op-version)?

After all servers and clients are upgraded, how do we increase the cluster-wide op-version?
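
As a starting point for the op-version questions above, it may help to see what each node currently records. The sketch below only reads a value; it does not raise the op-version. It assumes that glusterd keeps an "operating-version=N" line in /var/lib/glusterd/glusterd.info, which is my understanding of the 3.4-era layout and should be checked on the actual installation. The function names are hypothetical.

```python
def read_glusterd_info(path="/var/lib/glusterd/glusterd.info"):
    """Parse simple key=value lines into a dict (path is an assumption)."""
    info = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or "=" not in line:
                continue  # skip blank or malformed lines
            key, value = line.split("=", 1)
            info[key.strip()] = value.strip()
    return info


def operating_version(info):
    """Return the recorded operating-version as an int, or None if absent."""
    value = info.get("operating-version")
    return int(value) if value is not None else None
```

A node with no operating-version entry returns None here, which would be worth comparing against the version reported by the upgraded peers.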

best regards,
joão

[1]
http://vbellur.wordpress.com/2013/07/15/upgrading-to-glusterfs-3-4/
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users




