On 9/15/22 16:42, Fulvio Galeazzi wrote:
Hello,
I am on Nautilus and today, after upgrading the operating system
(from CentOS 7 to CentOS 8 Stream) on a couple of OSD servers and adding
them back to the cluster, I noticed some PGs are still "activating".
The upgraded servers are in the same "rack"; I have replica-3
pools with a 1-per-rack rule, and 6+4 EC pools (in some cases with an SSD
pool for metadata).
More details:
- on the two OSD servers I upgraded, I ran "systemctl stop ceph.target"
and waited a while to verify that all PGs would remain "active"
- went on with the upgrade and the ceph-ansible reconfiguration
- as soon as I started adding the OSDs back, I saw "slow ops"
- to exclude a possible effect of the updated packages, I ran "yum update" on
all OSD servers and rebooted them one by one
- after 2-3 hours, the last OSD disks finally came up
- I am left with:
  about 1k "slow ops" (if I pause recovery, the count stays roughly
  stable but the max age keeps increasing)
  ~200 inactive PGs
What does a ceph -s show?
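If the summary alone is not enough, "ceph health detail" should also list
the inactive PGs and which OSDs are reporting slow ops, for example:

    ceph -s
    ceph health detail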
Did you set the "noout" flag during the upgrades?
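The usual sequence for planned maintenance is something like:

    ceph osd set noout      # before stopping the OSDs, so they are not marked out
    ceph osd unset noout    # once the upgraded OSDs are back in the cluster

otherwise the cluster starts remapping data as soon as the stopped OSDs are
marked out.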
You might query a PG to get more info: ceph pg <pg.id> query
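For example, picking one of the stuck PGs (the PG id below is just a
placeholder; you can list real ones with "ceph pg dump_stuck inactive"):

    ceph pg dump_stuck inactive
    ceph pg 2.1f query

The "recovery_state" section of the query output usually tells you what the
PG is waiting for.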
Gr. Stefan