Disable peering of some pool

Jan Pekař - Imatic <jan.pekar@xxxxxxxxx> · Wed, 16 Mar 2022 16:34:34 +0100

Hi all,

we have a problem on our production cluster running Nautilus (14.2.22).

The cluster is almost full, and a few months ago we noticed issues with slow peering: when we restart any OSD (or host), it takes hours 
to finish the peering process instead of minutes.

We noticed that one of our pools contains about 90k objects (300 GB) per PG, so we decided to increase pg_num on that pool so that each 
individual PG peers more quickly. During that change we ended up with PGs stuck inactive for hours with peering not finished, and some 
OSDs went down with this error:
https://tracker.ceph.com/issues/51168
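
To illustrate the reasoning behind raising pg_num, here is a rough back-of-the-envelope sketch. It assumes objects are spread evenly across PGs, and the pg_num values (64 and 256) are hypothetical placeholders, not the real values from our cluster. The function name is made up for this example.

```python
def per_pg_after_split(objects_per_pg, gb_per_pg, old_pg_num, new_pg_num):
    """Estimate per-PG object count and size after raising pg_num.

    Assumes objects are distributed evenly across PGs, so splitting
    by a factor of N divides both numbers by N.
    """
    if old_pg_num <= 0 or new_pg_num < old_pg_num:
        raise ValueError("new_pg_num must be >= old_pg_num > 0")
    factor = new_pg_num / old_pg_num
    return objects_per_pg / factor, gb_per_pg / factor


# Hypothetical pg_num values; 90k objects and 300 GB per PG are from above.
objs, gb = per_pg_after_split(90_000, 300, old_pg_num=64, new_pg_num=256)
print(f"{objs:.0f} objects, {gb:.0f} GB per PG")
```

This only estimates the end state once splitting has completed; it says nothing about how long peering takes while the split is in progress.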

We decided to restart all OSDs and wait, but the problem with slow peering persists.

Is there any way to get the cluster healthy again? Or to disable peering of a specific pool, so that the other pools with RBD images 
can peer and come online, and only after that try to peer the big pool?

Thank you for your help; this is an urgent situation.

With regards
Jan Pekar

--

============
Ing. Jan Pekař
jan.pekar@xxxxxxxxx
----
Imatic | Jagellonská 14 | Praha 3 | 130 00
https://www.imatic.cz | +420326555326
============