On 04/18/2013 05:28 PM, Gregory Farnum wrote:
On Wed, Apr 17, 2013 at 7:40 AM, Guido Winkelmann
<guido@xxxxxxxxxxxxxxxxx> wrote:
Hi,
I just tried upgrading parts of our experimental Ceph cluster from 0.56.1 to
0.60, and it looks like the new mon daemon from 0.60 cannot talk to the
0.56.1 ones at all.
Long story short, we had to move some hardware around and during that time I
had to shrink the cluster to one single machine. My plan was to expand it to
three machines again, so that I would again have 3 mons and 3 osds, as before.
I just installed the first new machine, going straight for 0.60, but leaving
the remaining old one at 0.56.1. I added the new mon to the monmap according
to the documentation and started the new mon daemon, but the mon cluster
wouldn't achieve quorum. In the logs for the new mon, I saw the following line
repeated many times:
0 -- 10.6.224.129:6789/0 >> 10.6.224.131:6789/0 pipe(0x2da5ec0 sd=20 :37863
s=1 pgs=0 cs=0 l=0).connect protocol version mismatch, my 10 != 9
The old mon had no such lines in its log.
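A quick, purely illustrative Python sketch for spotting which monitor logs contain that message (the log path below is just the usual default and may need adjusting for your setup):

#!/usr/bin/env python3
# Illustrative only: scan monitor logs for the "protocol version mismatch"
# message quoted above. The log location is an assumption; point it at
# wherever your ceph-mon daemons actually write their logs.
import glob

LOG_GLOB = "/var/log/ceph/ceph-mon.*.log"   # assumed default log location

for path in glob.glob(LOG_GLOB):
    with open(path, errors="replace") as f:
        hits = [line.rstrip() for line in f if "protocol version mismatch" in line]
    if hits:
        print(f"{path}: {len(hits)} mismatch lines, e.g.")
        print("  " + hits[-1])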
I could only solve this by shutting down the old mon and upgrading it to 0.60
as well.
It looks to me like this means rolling upgrades without downtime won't be
possible from bobtail to cuttlefish. Is that correct?
If the cluster is in good shape, this shouldn't actually result in downtime.
Do a rolling upgrade of your monitors; once a majority of them are on
Cuttlefish, they'll switch over and form a quorum. The only "downtime" is the
period each store needs to update, which shouldn't be long, and during it only
the monitors are inaccessible (unless the upgrade takes a truly ridiculous
amount of time). You can do rolling upgrades on all the rest of the daemons
just the same as before.
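To make the monitor part concrete, here is a very rough Python sketch of driving such a rolling upgrade and waiting for the quorum to re-form. None of it comes from this thread: the monitor names, the manual upgrade hook, and the exact shape of the ceph quorum_status JSON (a "quorum" list plus "monmap"/"mons") are assumptions you would want to check against your own setup.

#!/usr/bin/env python3
# Rough illustration only: walk through the monitors one at a time and wait
# for quorum to re-form once a majority is on the new version. Host names,
# the manual upgrade step and the quorum_status JSON fields are assumptions.
import json
import subprocess
import time

MON_HOSTS = ["mon-a", "mon-b", "mon-c", "mon-d", "mon-e"]  # hypothetical
MAJORITY = len(MON_HOSTS) // 2 + 1

def upgrade_mon(host):
    # Placeholder: upgrade the packages on `host` and restart its ceph-mon
    # with whatever tooling you use (ssh, puppet, ...); this just waits.
    input(f"upgrade and restart the mon on {host}, then press enter... ")

def mons_in_quorum():
    """Return the number of monitors in quorum, or None if nobody answers."""
    try:
        out = subprocess.run(["ceph", "quorum_status"],
                             capture_output=True, check=True, timeout=10).stdout
        return len(json.loads(out)["quorum"])
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, ValueError):
        return None  # no quorum yet, so the cluster can't answer

for i, host in enumerate(MON_HOSTS, start=1):
    upgrade_mon(host)
    if i < MAJORITY:
        # Not enough upgraded mons yet; they sit outside the quorum while
        # the old ones keep serving.
        print(f"{i}/{len(MON_HOSTS)} upgraded, majority not reached yet")
        continue
    n = mons_in_quorum()
    while n is None or n < MAJORITY:
        time.sleep(5)
        n = mons_in_quorum()
    print(f"quorum re-formed with {n} monitors after upgrading {host}")

The point it tries to capture is the one above: the first minority of upgraded monitors simply sit out, and quorum only comes back once a majority is running the new version.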
Another potential source of delay would be the synchronization process
triggered when a majority of monitors have been upgraded.
Say you have 5 monitors.
You upgrade two while the cluster is happily running: their stores are
converted, which may take a while if the stores are huge [1], and those two
monitors are then ready to join the quorum as soon as a third one is
upgraded.
During this time, your cluster keeps on going, with more versions being
created.
Then you decide to upgrade the third monitor. It goes through the same period
of downtime as the other two (which, as Greg said, shouldn't be long, but may
be if your stores are huge [1]), and this will be the bulk of your downtime.
However, because the cluster has kept going in the meantime, there's a chance
that the first two upgraded monitors will have fallen out of sync with the
more recent cluster state. That will trigger a store sync, which shouldn't
take long either, but it is somewhat bound by the store size and the number
of versions created in between. You might even be lucky enough to get through
the whole thing quickly and find that the sync isn't necessary at all
(there's another mechanism that handles catch-up when the monitors haven't
drifted that far).
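If you want to watch a freshly upgraded monitor go through this phase, you can poll its admin socket. The following is only an illustrative Python sketch; the socket path, the mon_status admin-socket command and the exact state names are assumptions of mine, so check them against your release.

#!/usr/bin/env python3
# Illustrative sketch: poll a monitor's admin socket until it is back in
# quorum. Socket path, the mon_status command and the state names
# ("probing", "synchronizing", "electing", "leader", "peon") are assumptions.
import json
import subprocess
import time

MON_ID = "a"  # hypothetical monitor id
SOCK = f"/var/run/ceph/ceph-mon.{MON_ID}.asok"

while True:
    out = subprocess.run(["ceph", "--admin-daemon", SOCK, "mon_status"],
                         capture_output=True, check=True).stdout
    state = json.loads(out).get("state", "unknown")
    print(f"mon.{MON_ID} state: {state}")
    if state in ("leader", "peon"):
        break  # back in quorum; any sync/catch-up phase is over
    time.sleep(5)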
Anyway, when you finally upgrade that third monitor (out of 5), it is going
to break quorum, so it would probably be wise to just upgrade the remaining
monitors all at once.
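In case the arithmetic isn't obvious, here's a tiny illustrative snippet; none of it touches a real cluster:

# Spelling out the quorum arithmetic in the 5-monitor example above.
n_mons = 5
majority = n_mons // 2 + 1           # 3 of 5 needed for quorum
for upgraded in range(n_mons + 1):
    old = n_mons - upgraded          # old and new mons can't share a quorum
    if old >= majority:
        side = "the old monitors"
    elif upgraded >= majority:
        side = "the upgraded monitors"
    else:
        side = "nobody"
    print(f"{upgraded} upgraded / {old} old -> quorum held by {side}")
# While the third monitor is down converting its store you have 2 old + 2 new,
# i.e. the "nobody" case: that is the quorum break mentioned above, and also
# why you may as well upgrade the last two monitors right away.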
[1] - With the new leveldb tuning this might not even be an issue.
-Joao
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com