On 20/12/2024 09:16, Robert Sander wrote:
> Hi Florian,
>
> On 12/18/24 at 16:18, Florian Haas wrote:
>> To illustrate why, assume you've got 3 Mons in your cluster. Now, on one of your physical hosts that runs a Mon, you enter maintenance mode. This will just shut down the Mon. Now you proceed with the system upgrade, which will vary in length. During that time you're running on two Mons. Now, something unexpected happens on another node that runs another Mon. Boom, your cluster is now offline, and you need to scramble to fix things.
>
> Yes, this is a risk. But shouldn't you run on 5 MONs today? At least this seems to be the recommended number from the default service spec.
The consensus heard at Cephalocon was still "for the vast majority of clusters you'll be fine with 3 MONs," so it is safe to assume that that is what most Ceph clusters out there are configured with today.
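(For the arithmetic: a MON quorum needs a strict majority, i.e. floor(n/2) + 1. With 3 MONs that is 2, so one MON down for maintenance plus one unexpected failure leaves you at 1 and the cluster loses quorum. With 5 MONs you would need to lose 3 before that happens.)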
>> If conversely you set _no_schedule and you still have other hosts to migrate your Mon to (per your placement policy), then you'll run on 3 Mons throughout.
>
> Then maybe the maintenance mode should also set this label.
I think the semantics of maintenance mode are explicitly designed to be "shut down services on one node and *don't* move them around", which totally has merit in certain scenarios. Rolling OS upgrade on a cluster just isn't one of them, in my humble opinion.
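For reference, the label-based flow I have in mind is roughly the following. This is just a sketch with a made-up hostname, and how quickly cephadm redeploys the Mon elsewhere depends on your placement spec:

  ceph orch host label add host3 _no_schedule   # cephadm moves the Mon (and other non-OSD daemons) off host3
  ceph orch ps host3                            # verify nothing apart from OSDs is still scheduled there
  # ... do the OS upgrade and reboot ...
  ceph orch host label rm host3 _no_schedule    # allow cephadm to schedule daemons on host3 again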
>> Also, while maintenance does stop and disable the systemd ceph.target, meaning the services won't come up even if the host is rebooted, "systemctl status ceph.target" will still return "active" and "enabled" which may break assumptions by monitoring systems, orchestration frameworks, etc.
>
> Your step 8 also just stops the ceph.target. Where is the difference?
If I run "systemctl stop ceph.target" on a node in the cluster where I've been testing this, "podman ps" is empty and "systemctl status ceph.target" returns "inactive (dead)". Likewise, "systemctl is-active ceph.target" returns "inactive" and exits 3.
If conversely I put a node in maintenance mode, "systemctl is-active ceph.target" returns "active" and exits 0, while "podman ps" likewise shows no running containers.
I do not know *why* that is the case, and I'm afraid I don't have an opportunity to dig into it right now, but that's the current behaviour as I see it.
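For anyone who wants to reproduce this, the checks boil down to the following (hostname is a placeholder, output is from my test cluster):

  # Plain stop on the node itself:
  systemctl stop ceph.target
  systemctl is-active ceph.target   # -> "inactive", exit status 3
  podman ps                         # -> no containers

  # Maintenance mode (entered from a node with the admin keyring):
  ceph orch host maintenance enter <hostname>
  systemctl is-active ceph.target   # -> "active", exit status 0
  podman ps                         # -> still no containers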
Cheers, Florian