Re: [SUSPECTED PHISHING] ceph is stuck after increasing pg_nums


 



Hi,

On 11/4/22 09:45, Adrian Nicolae wrote:
Hi,

We have a Pacific cluster (16.2.4) with 30 servers and 30 osds. We have been increasing the pg_num for the data bucket for more than a month; I usually added 64 pgs in each step and didn't have any issues. The cluster was healthy before increasing the pgs.

Today I've added 128 pgs and the cluster is stuck with some unknown pgs and some others in the peering state. I've restarted a few osds with slow_ops and even a few hosts, but it didn't change anything. We don't have any networking issues. Do you have any suggestions? Our service is completely down ...


*snipsnap*


Do some of the OSDs exceed the PGs-per-OSD limit? If this is the case, the affected OSDs will not allow peering, and I/O to those OSDs will be stuck.

You can check the number of PGs per OSD in the 'ceph osd df tree' output. To solve this problem you can increase the limit, e.g. by setting 'mon_max_pg_per_osd' via 'ceph config'. The default limit is 200 AFAIK.
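A rough sketch of how that could look (the value 400 is only an illustration, pick whatever fits your cluster and check the actual default for your release):

  # show the PG count per OSD (PGS column) and spot OSDs over the limit
  ceph osd df tree

  # raise the per-OSD PG limit cluster-wide
  ceph config set global mon_max_pg_per_osd 400

  # watch whether the stuck/peering PGs recover
  ceph -s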


Regards,

Burkhard


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



