Hello people :)

We are facing a situation quite similar to the one described here:
http://tracker.ceph.com/issues/23117

Namely: we have a Luminous cluster consisting of 16 hosts, where each host holds 12 OSDs on spinning disks and 4 OSDs on SSDs. Let's forget the SSDs for now, since they're not in use at the moment. We have an Erasure Coding pool (k=6, m=3) with 4096 PGs, residing on the spinning disks, with the host as failure domain.

After taking a host (and its OSDs) out for maintenance, we're trying to put the OSDs back in. While the cluster starts recovering, we observe

> Reduced data availability: 170 pgs inactive

and

> 170 activating+remapped

This eventually leads to slow/stuck requests, and we have to take the OSDs out again.

While searching around we came across the already mentioned issue on the tracker [1], and we're wondering whether "PG overdose protection" [2] is what we're really facing now. Our cluster features:

  "mon_max_pg_per_osd": "200",
  "osd_max_pg_per_osd_hard_ratio": "2.000000",

What is more, we observed that the PG distribution among the OSDs is not uniform, e.g.:

> ID  CLASS WEIGHT    REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS TYPE NAME
> -1        711.29004 -        666T   165T   500T   0     0    -   root default
> -17       44.68457  -        45757G 11266G 34491G 24.62 0.99 -   host rd3-1427
>   9 hdd   3.66309   1.00000  3751G  976G   2774G  26.03 1.05 212 osd.9
>  30 hdd   3.66309   1.00000  3751G  961G   2789G  25.64 1.03 209 osd.30
>  46 hdd   3.66309   1.00000  3751G  902G   2848G  24.07 0.97 196 osd.46
>  61 hdd   3.66309   1.00000  3751G  877G   2873G  23.40 0.94 190 osd.61
>  76 hdd   3.66309   1.00000  3751G  984G   2766G  26.24 1.05 214 osd.76
>  92 hdd   3.66309   1.00000  3751G  894G   2856G  23.84 0.96 194 osd.92
> 107 hdd   3.66309   1.00000  3751G  881G   2869G  23.50 0.94 191 osd.107
> 123 hdd   3.66309   1.00000  3751G  973G   2777G  25.97 1.04 212 osd.123
> 138 hdd   3.66309   1.00000  3751G  975G   2775G  26.01 1.05 212 osd.138
> 156 hdd   3.66309   1.00000  3751G  813G   2937G  21.69 0.87 176 osd.156
> 172 hdd   3.66309   1.00000  3751G  1016G  2734G  27.09 1.09 221 osd.172
> 188 hdd   3.66309   1.00000  3751G  998G   2752G  26.62 1.07 217 osd.188

Could these OSDs, holding more than 200 PGs, contribute to the problem? Is there any way to confirm that we're hitting "PG overdose protection"? And if that's what is happening, how can we restore our cluster back to normal?

Apart from getting these OSDs back to work, we're concerned about the overall choice of the number of PGs (4096) for that (6,3) EC pool.

Any help appreciated,
Alex

[1] http://tracker.ceph.com/issues/23117
[2] https://ceph.com/community/new-luminous-pg-overdose-protection/

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
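P.S. For reference, here's our back-of-the-envelope arithmetic for the PG counts, assuming the overdose hard limit is mon_max_pg_per_osd * osd_max_pg_per_osd_hard_ratio as described in [2] (please correct us if we're doing this math wrong):

```python
# Numbers from our cluster description above.
hosts = 16
hdd_osds_per_host = 12
num_osds = hosts * hdd_osds_per_host      # 192 HDD OSDs in the pool

pgs = 4096
k, m = 6, 3
shards_per_pg = k + m                     # each EC PG places k+m = 9 shards

# Average PG (shard) count per OSD for this pool alone:
avg_pgs_per_osd = pgs * shards_per_pg / num_osds
print(avg_pgs_per_osd)                    # 192.0

# Hard limit at which (we assume) overdose protection refuses PGs:
mon_max_pg_per_osd = 200
osd_max_pg_per_osd_hard_ratio = 2.0
hard_limit = mon_max_pg_per_osd * osd_max_pg_per_osd_hard_ratio
print(hard_limit)                         # 400.0
```

So even when perfectly balanced we'd average ~192 PGs per OSD, already close to mon_max_pg_per_osd = 200, and our real distribution goes up to 221. Presumably during recovery, with remapped PGs counted on top, some OSDs could approach the 400 hard limit? That's part of why we're questioning the 4096 PG choice.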