Re: No rolling updates from v0.56 to v0.60+?


 



On 04/18/2013 05:28 PM, Gregory Farnum wrote:
On Wed, Apr 17, 2013 at 7:40 AM, Guido Winkelmann
<guido@xxxxxxxxxxxxxxxxx> wrote:
Hi,

I just tried upgrading parts of our experimental ceph cluster from 0.56.1 to
0.60, and it looks like the new mon-daemon from 0.60 cannot talk to those from
0.56.1 at all.

Long story short, we had to move some hardware around and during that time I
had to shrink the cluster to a single machine. My plan was to expand it to
three machines again, so that I would have 3 mons and 3 osds, as before.
I just installed the first new machine, going straight for 0.60, but leaving
the remaining old one at 0.56.1. I added the new mon to the mon map according
to the documentation and started the new mon daemon, but the mon-cluster
wouldn't achieve quorum. In the logs for the new mon, I saw the following line
repeated a lot:

0 -- 10.6.224.129:6789/0 >> 10.6.224.131:6789/0 pipe(0x2da5ec0 sd=20 :37863
s=1 pgs=0 cs=0 l=0).connect protocol version mismatch, my 10 != 9

The old mon had no such lines in its log.
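
(A toy illustration of what that message means, in plain Python rather than
Ceph's actual messenger code: each side advertises its wire-protocol version
during the connect handshake, and the connection is refused on a mismatch, so
the 0.60 mon never manages to complete a connection to the 0.56.1 one.)

MY_PROTOCOL = 10                      # a 0.60 (cuttlefish-era) monitor

def accept_connection(peer_protocol):
    # Refuse peers that speak a different monitor wire protocol.
    if peer_protocol != MY_PROTOCOL:
        print("connect protocol version mismatch, my %d != %d"
              % (MY_PROTOCOL, peer_protocol))
        return False                  # connection is dropped
    return True

accept_connection(9)                  # a 0.56.1 (bobtail) monitor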

I could only solve this by shutting down the old mon and upgrading it to 0.60
as well.

It looks to me like this means rolling upgrades without downtime won't be
possible from bobtail to cuttlefish. Is that correct?

If the cluster is in good shape, this shouldn't actually result in
downtime. Do a rolling upgrade of your monitors; once a majority of
them are on Cuttlefish, they'll switch over to form the quorum. The
"downtime" is just the period a store requires to update, which
shouldn't be long, and during it only the monitors are inaccessible
(unless the upgrade takes a truly ridiculous amount of time). All the
rest of the daemons you can do rolling upgrades on just the same as
before.
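
To make the majority rule concrete, here is a minimal sketch in plain Python
(not Ceph's election code), assuming quorum simply needs a strict majority of
monitors speaking the same wire-protocol version:

def quorum_possible(protocol_versions):
    # Toy model: quorum requires a strict majority of monitors that all
    # speak the same wire-protocol version.
    majority = len(protocol_versions) // 2 + 1
    counts = {}
    for v in protocol_versions:
        counts[v] = counts.get(v, 0) + 1
    return any(n >= majority for n in counts.values())

print(quorum_possible([9, 10]))      # False: one of two is not a majority
print(quorum_possible([9, 10, 10]))  # True: two cuttlefish mons out of three

With only two monitors, as in Guido's case, neither version ever has a
majority, which is why the old and new mon could not form a quorum until both
were on 0.60.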

Another potential source of delay would be the synchronization process triggered when a majority of monitors have been upgraded.

Say you have 5 monitors.

You upgrade two while the cluster is happily running: their stores are converted, which may take longer if the stores are huge [1], and those two monitors are then ready to join the quorum as soon as a third member is upgraded.

During this time, your cluster keeps on going, with more versions being created.

And then you decide to upgrade the third monitor. It will go through the same period of downtime as the other two (which, as Greg said, shouldn't be long, but may be if your stores are huge [1]), and this will be the bulk of your downtime.

However, because the cluster has kept on going, there's a chance that the first two monitors you upgraded have fallen out of sync with the more recent cluster state. That will trigger a store sync, which shouldn't take long either, but it is somewhat bound by the store size and the number of versions created in the meantime. If you're lucky, the whole thing goes through quickly enough that the sync isn't even necessary (there's another mechanism to handle catch-up when the monitors haven't drifted that much).

In any case, upgrading that third monitor (out of 5) is going to break quorum, so it would probably be wise to just upgrade the remaining monitors all at once.
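
For illustration, here is a toy walk-through of that 5-monitor scenario in
plain Python (not real Ceph behaviour), assuming the monitor whose store is
converting is offline and that quorum needs three monitors speaking the same
protocol version:

def quorum_during_step(total, already_upgraded):
    # The monitor being upgraded is treated as offline while its store
    # converts; the rest can only form a quorum with peers that speak
    # the same protocol (9 = bobtail, 10 = cuttlefish).
    majority = total // 2 + 1
    converting = 1
    old = total - already_upgraded - converting
    new = already_upgraded
    return old >= majority or new >= majority

for step in range(1, 6):
    ok = quorum_during_step(5, step - 1)
    print("upgrading mon #%d: quorum available? %s" % (step, ok))

# upgrading mon #1: quorum available? True   (4 bobtail mons keep quorum)
# upgrading mon #2: quorum available? True   (3 bobtail mons keep quorum)
# upgrading mon #3: quorum available? False  (2 + 2: the brief outage)
# upgrading mon #4: quorum available? True   (3 cuttlefish mons form quorum)
# upgrading mon #5: quorum available? True

The False step is the store-conversion window described above; since quorum is
lost there anyway, upgrading the remaining monitors at the same time doesn't
widen the outage.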


[1] - With the new leveldb tuning this might not even be an issue.

  -Joao


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




