Re: Upgrade tips from Luminous to Nautilus?

Nico Schottelius <nico.schottelius@xxxxxxxxxxx> · Thu, 29 Apr 2021 11:08:37 +0200

I believe it was nautilus that started requiring

ms_bind_ipv4 = false
ms_bind_ipv6 = true

if you run IPv6-only clusters. OSDs prior to nautilus worked without
these settings for us.
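
A quick way to confirm both options are present before restarting
daemons is to parse the config file. The following is only a sketch: it
assumes the options live in the [global] section of ceph.conf, the
helper name missing_ipv6_settings is made up for illustration, and
Python's configparser does not cover every ceph.conf quirk (inline
comments, for example).

```python
import configparser

# The two settings named above for IPv6-only clusters.
EXPECTED = {"ms_bind_ipv4": "false", "ms_bind_ipv6": "true"}


def missing_ipv6_settings(conf_text):
    """Return the expected settings that are absent or different in [global]."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    found = {}
    if parser.has_section("global"):
        for key, value in parser.items("global"):
            # Ceph accepts "ms bind ipv6" and "ms_bind_ipv6" interchangeably,
            # so normalise spaces and dashes to underscores before comparing.
            normalised = key.replace(" ", "_").replace("-", "_")
            found[normalised] = value.strip().lower()
    return {k: v for k, v in EXPECTED.items() if found.get(k) != v}


# Hypothetical config text that sets only one of the two options.
sample = """
[global]
fsid = 00000000-0000-0000-0000-000000000000
ms_bind_ipv6 = true
"""
print(missing_ipv6_settings(sample))  # {'ms_bind_ipv4': 'false'}
```

An empty dict means both options are set as shown above; anything else
lists what is still missing.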

I'm not sure if the port change (v1->v2) was part of luminous->nautilus
as well, but you might want to check your firewalling (if any).
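
For the firewall check: as far as I know the v2 messenger protocol
arrived with nautilus, and monitors listen on port 3300 for it by
default, next to the old v1 port 6789. A rough reachability test, run
from a client or OSD node, could look like the sketch below. The
function names are made up for illustration, and a successful TCP
connect only shows that the port is not filtered, not that the monitor
is healthy.

```python
import socket

# Default monitor ports: msgr v1 on 6789, msgr v2 on 3300.
MON_PORTS = {"v1": 6789, "v2": 3300}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_mon_ports(host, ports=MON_PORTS):
    """Map each protocol name to whether its port is reachable on host."""
    return {name: port_open(host, port) for name, port in ports.items()}
```

Calling check_mon_ports("osdnode01") for each monitor host returns
something like {"v1": True, "v2": False} when only the old port is
allowed through.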

Overall I recall luminous->nautilus being a bit rockier than usual
(compared to the previous releases), but nothing too serious.

Cheers,

Nico

Mark Schouten <mark@xxxxxxxx> writes:

> Hi,
>
> We've done our fair share of Ceph cluster upgrades since Hammer, and
> have not seen many problems with them. I'm now at the point where I
> have to upgrade a rather large cluster running Luminous, and I would
> like to hear from other users whether they have run into issues that I
> can expect, so that I can anticipate them beforehand.
>
> As said, the cluster is running Luminous (12.2.13) and has the following
> services active:
>   services:
>     mon: 3 daemons, quorum osdnode01,osdnode02,osdnode04
>     mgr: osdnode01(active), standbys: osdnode02, osdnode03
>     mds: pmrb-3/3/3 up {0=osdnode06=up:active,1=osdnode08=up:active,2=osdnode07=up:active}, 1 up:standby
>     osd: 116 osds: 116 up, 116 in;
>     rgw: 3 daemons active
>
>
> Of the OSDs, 11 are SSDs and 105 are HDDs. The capacity of the cluster
> is 1.01PiB.
>
> We have 2 active crush-rules on 18 pools. All pools have a size of 3,
> and there is a total of 5760 pgs.
>     {
>         "rule_id": 1,
>         "rule_name": "hdd-data",
>         "ruleset": 1,
>         "type": 1,
>         "min_size": 1,
>         "max_size": 10,
>         "steps": [
>             {
>                 "op": "take",
>                 "item": -10,
>                 "item_name": "default~hdd"
>             },
>             {
>                 "op": "chooseleaf_firstn",
>                 "num": 0,
>                 "type": "host"
>             },
>             {
>                 "op": "emit"
>             }
>         ]
>     },
>     {
>         "rule_id": 2,
>         "rule_name": "ssd-data",
>         "ruleset": 2,
>         "type": 1,
>         "min_size": 1,
>         "max_size": 10,
>         "steps": [
>             {
>                 "op": "take",
>                 "item": -21,
>                 "item_name": "default~ssd"
>             },
>             {
>                 "op": "chooseleaf_firstn",
>                 "num": 0,
>                 "type": "host"
>             },
>             {
>                 "op": "emit"
>             }
>         ]
>     }
>
> rbd -> crush_rule: hdd-data
> .rgw.root -> crush_rule: hdd-data
> default.rgw.control -> crush_rule: hdd-data
> default.rgw.data.root -> crush_rule: ssd-data
> default.rgw.gc -> crush_rule: ssd-data
> default.rgw.log -> crush_rule: ssd-data
> default.rgw.users.uid -> crush_rule: hdd-data
> default.rgw.usage -> crush_rule: ssd-data
> default.rgw.users.email -> crush_rule: hdd-data
> default.rgw.users.keys -> crush_rule: hdd-data
> default.rgw.meta -> crush_rule: hdd-data
> default.rgw.buckets.index -> crush_rule: ssd-data
> default.rgw.buckets.data -> crush_rule: hdd-data
> default.rgw.users.swift -> crush_rule: hdd-data
> default.rgw.buckets.non-ec -> crush_rule: ssd-data
> DB0475 -> crush_rule: hdd-data
> cephfs_pmrb_data -> crush_rule: hdd-data
> cephfs_pmrb_metadata -> crush_rule: ssd-data
>
>
> All but four clients are running Luminous; the other four are running
> Jewel (and need upgrading before we proceed with this upgrade).
>
> So, normally, I would 'just' upgrade all Ceph packages on the
> monitor-nodes and restart mons and then mgrs.
>
> After that, I would upgrade all Ceph packages on the OSD nodes and
> restart all the OSDs. Then, after that, the MDSes and RGWs. Restarting
> the OSDs will probably take a while.
>
> If anyone has a hint on what I should expect to cause some extra load or
> waiting time, that would be great.
>
> Obviously, we have read
> https://ceph.com/releases/v14-2-0-nautilus-released/ , but I'm looking
> for real world experiences.
>
> Thanks!

--
Sustainable and modern Infrastructures by ungleich.ch