Re: Nautilus: PGs stuck "activating" after adding OSDs. Please help!

On 9/15/22 16:42, Fulvio Galeazzi wrote:

Hello,
    I am on Nautilus and today, after upgrading the operating system (from CentOS 7 to CentOS 8 Stream) on a couple of OSD servers and adding them back to the cluster, I noticed some PGs are still "activating". The upgraded servers are in the same "rack", and I have replica-3 pools with a 1-per-rack rule, plus 6+4 EC pools (in some cases with an SSD pool for metadata).

More details:
- on the two OSD servers I upgraded, I ran "systemctl stop ceph.target"
    and waited a while, to verify all PGs would remain "active"
- went on with the upgrade and ceph-ansible reconfig
- as soon as I started adding OSDs I saw "slow ops"
- to exclude a possible effect of the updated packages, I ran "yum update"
    on all OSD servers, and rebooted them one by one
- after 2-3 hours, the last OSD disks finally came up
- I am left with:
     about 1k "slow ops" (if I pause recovery, the count stays ~stable
         but the max age keeps increasing)
     ~200 inactive PGs
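
For reference, I am enumerating these with the standard CLI (nothing exotic):

    # list the PGs stuck inactive (activating, peering, ...)
    ceph pg dump_stuck inactive

    # the per-OSD slow-ops warnings show up in the health detail
    ceph health detail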

What does a "ceph -s" show?
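Ideally together with the OSD tree, to confirm the re-added OSDs came back up/in under the expected rack bucket (a sketch, assuming your CRUSH tree uses rack buckets as described):

    # overall cluster state: PG states, degraded objects, slow ops
    ceph -s

    # check the re-added OSDs are up/in and placed under the right rack
    ceph osd tree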

Did you set the "noout" flag during the upgrades?
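If not, for the next round the usual pattern is (a sketch, adjust to your own procedure):

    # before stopping the OSDs on a host
    ceph osd set noout

    # ... upgrade / reboot the host ...

    # once all OSDs are back up and in
    ceph osd unset noout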

You might query a PG to get more info: "ceph pg <pg.id> query"
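For example (7.1a is just a placeholder id, pick one of your inactive PGs):

    ceph pg 7.1a query

The "recovery_state" section of the output usually tells you why the PG is stuck, e.g. a "peering_blocked_by" entry pointing at a specific OSD.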

Gr. Stefan
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



