Re: Experimental upgrade of a Cephadm-managed Squid cluster to Ubuntu Noble (walk-through and RFC)

Just to comment on ceph.target: technically, in a containerized Ceph
deployment a node can host daemons from *many Ceph clusters* (each with its
own ceph_fsid).

ceph.target is a global unit and is the root for all the clusters running
on the node. There is another target that is specific to each cluster
(ceph-<fsid>.target). From my testing environment, where I created two
clusters and forced maintenance mode for the first one only:

[root@ceph-node-2 ~]# systemctl list-dependencies ceph.target
ceph.target
○ ├─ceph-789c5638-bec0-11ef-9350-5254002ff0d8.target
○ │ ├─ceph-789c5638-bec0-11ef-9350-5254002ff0d8@xxxxxxxxxxxxxxxxxx-node-2.service
○ │ ├─ceph-789c5638-bec0-11ef-9350-5254002ff0d8@xxxxxxxxxx-node-2.service
× │ ├─ceph-789c5638-bec0-11ef-9350-5254002ff0d8@xxxxxxxx-node-2.ptlcoi.service
○ │ ├─ceph-789c5638-bec0-11ef-9350-5254002ff0d8@xxxxxxxx-node-2.service
× │ └─ceph-789c5638-bec0-11ef-9350-5254002ff0d8@xxxxxxxxxxxxxxxxxx-node-2.service
● └─ceph-a3cf42a0-becc-11ef-9470-52540012a496.target
●   ├─ceph-a3cf42a0-becc-11ef-9470-52540012a496@xxxxxxxxxxxxxxxxxx-node-2.service
●   ├─ceph-a3cf42a0-becc-11ef-9470-52540012a496@xxxxxxxxxx-node-2.service
●   ├─ceph-a3cf42a0-becc-11ef-9470-52540012a496@xxxxxxxx-node-2.bodyuz.service
●   ├─ceph-a3cf42a0-becc-11ef-9470-52540012a496@xxxxxxxx-node-2.service
●   └─ceph-a3cf42a0-becc-11ef-9470-52540012a496@xxxxxxxxxxxxxxxxxx-node-2.service

*Global target:*
[root@ceph-node-2 ~]# systemctl is-active ceph.target
active

*First cluster:*
[root@ceph-node-2 ~]# systemctl is-active ceph-789c5638-bec0-11ef-9350-5254002ff0d8.target
inactive

*Second cluster:*
[root@ceph-node-2 ~]# systemctl is-active ceph-a3cf42a0-becc-11ef-9470-52540012a496.target
active
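
If you want a script- or monitoring-friendly check that looks at the
per-cluster targets rather than the global ceph.target, something along
these lines should work (a rough sketch of my own, assuming the usual
cephadm layout where each cluster on the host gets its own
/var/lib/ceph/<fsid>/ directory):

for dir in /var/lib/ceph/*/; do
    fsid="$(basename "$dir")"
    # is-active exits non-zero when the target is stopped (e.g. in
    # maintenance mode), so the exit code can be used directly in checks
    state="$(systemctl is-active "ceph-${fsid}.target")"
    echo "cluster ${fsid}: ${state}"
done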

Best,
Redouane.

On Fri, Dec 20, 2024 at 11:14 AM Florian Haas <florian.haas@xxxxxxxxxx>
wrote:

> On 20/12/2024 09:16, Robert Sander wrote:
> > Hi Florian,
> >
> > Am 12/18/24 um 16:18 schrieb Florian Haas:
> >
> >> To illustrate why, assume you've got 3 Mons in your cluster.
> >>
> >> Now, on one of your physical hosts that runs a Mon, you enter
> >> maintenance mode. This will just shut down the Mon. Now you proceed with
> >> the system upgrade, which will vary in length. During that time you're
> >> running on two Mons.
> >>
> >> Now, something unexpected happens on another node that runs another Mon.
> >> Boom, your cluster is now offline, and you need to scramble to fix
> things.
> >
> > Yes, this is a risk. But shouldn't you run on 5 MONs today? At least
> > this seems to be the recommended number from the default service spec.
>
> The consensus heard at Cephalocon was still "for the vast majority of
> clusters you'll be fine with 3 MONs," so it is safe to assume that that
> is what most Ceph clusters out there are configured with today.
>
> >> If conversely you set _no_schedule and you still have other hosts to
> >> migrate your Mon to (per your placement policy), then you'll run on 3
> >> Mons throughout.
> >
> > Then maybe the maintenance mode should also set this label.
>
> I think the semantics of maintenance mode are explicitly designed to be
> "shut down services on one node and *don't* move them around", which
> totally has merit in certain scenarios. Rolling OS upgrade on a cluster
> just isn't one of them, in my humble opinion.
>
> >> Also, while maintenance mode does stop and disable the systemd
> >> ceph.target, meaning the services won't come up even if the host is
> >> rebooted, "systemctl status ceph.target" will still return "active" and
> >> "enabled", which may break assumptions by monitoring systems,
> >> orchestration frameworks, etc.
> >
> > Your step 8 also just stops the ceph.target. Where is the difference?
>
> If I run "systemctl stop ceph.target" on a node in the cluster where
> I've been testing this, "podman ps" is empty and
> "systemctl status ceph.target" returns "inactive (dead)". Likewise,
> "systemctl is-active ceph.target" returns "inactive" and exits 3.
>
> If conversely I put a node in maintenance mode,
> "systemctl is-active ceph.target" returns "active" and exits 0, while
> "podman ps" likewise shows no running containers.
>
> I do not know *why* that is the case, and I'm afraid I don't have an
> opportunity to dig into that now, but that's the current behaviour as I
> see it.
>
> Cheers,
> Florian
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



