You should probably have used 2048, following the usual target of ~100 PGs per OSD.
Just increase the mon_max_pg_per_osd option; ~200 is still okay-ish and your cluster will grow out of it :)
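Back-of-the-envelope, assuming the 192 HDD OSDs you describe below:

    # each k=6,m=3 EC PG lands on k+m = 9 OSDs
    # PGs per OSD = pg_num * (k+m) / number of OSDs
    #   4096 * 9 / 192 = 192   -> right up against mon_max_pg_per_osd = 200
    #   2048 * 9 / 192 =  96   -> close to the usual ~100 target

If you raise the limit instead, something like the following is the usual approach on Luminous (300 is just an example value; set it persistently in ceph.conf as well, since I'm not certain injectargs alone takes effect for this option without a restart):

    # ceph.conf, [global]:
    #   mon_max_pg_per_osd = 300
    # and/or inject at runtime on mons and OSDs:
    ceph tell mon.\* injectargs '--mon_max_pg_per_osd=300'
    ceph tell osd.\* injectargs '--mon_max_pg_per_osd=300'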
Paul
2018-08-01 19:55 GMT+02:00 Alexandros Afentoulis <alexaf+ceph@xxxxxxxxxxxx>:
Hello people :)
we are facing a situation quite similar to the one described here:
http://tracker.ceph.com/issues/23117
Namely:
we have a Luminous cluster consisting of 16 hosts, where each host holds
12 OSDs on spinning disks and 4 OSDs on SSDs. Let's forget the SSDs for
now since they're not used atm.
We have an Erasure Coding pool (k=6, m=3) with 4096 PGs, residing on the
spinning disks, with host as the failure domain.
After taking a host (and its OSDs) out for maintenance, we're trying
to put the OSDs back in. While the cluster starts recovering we observe
> Reduced data availability: 170 pgs inactive
and
> 170 activating+remapped
This eventually leads to slow/stuck requests and we have to take the
OSDs out again.
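For reference, something like the following should show which PGs are stuck and what state they are blocked in (a rough sketch; substitute a real PG id in the last command):

    ceph health detail            # summarizes the inactive PGs per pool
    ceph pg dump_stuck inactive   # lists the stuck PGs and their acting sets
    ceph pg 7.1a query            # 7.1a is a placeholder PG id; shows its peering/activation state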
While searching around we came across the aforementioned tracker issue
[1] and we're wondering whether "PG overdose protection" [2] is what
we're really facing now.
Our cluster features:
"mon_max_pg_per_osd": "200",
"osd_max_pg_per_osd_hard_ratio": "2.000000",
What is more, we observed that the PG distribution among the OSDs is
not uniform, e.g.:
> ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME
> -1 711.29004 - 666T 165T 500T 0 0 - root default
> -17 44.68457 - 45757G 11266G 34491G 24.62 0.99 - host rd3-1427
> 9 hdd 3.66309 1.00000 3751G 976G 2774G 26.03 1.05 212 osd.9
> 30 hdd 3.66309 1.00000 3751G 961G 2789G 25.64 1.03 209 osd.30
> 46 hdd 3.66309 1.00000 3751G 902G 2848G 24.07 0.97 196 osd.46
> 61 hdd 3.66309 1.00000 3751G 877G 2873G 23.40 0.94 190 osd.61
> 76 hdd 3.66309 1.00000 3751G 984G 2766G 26.24 1.05 214 osd.76
> 92 hdd 3.66309 1.00000 3751G 894G 2856G 23.84 0.96 194 osd.92
> 107 hdd 3.66309 1.00000 3751G 881G 2869G 23.50 0.94 191 osd.107
> 123 hdd 3.66309 1.00000 3751G 973G 2777G 25.97 1.04 212 osd.123
> 138 hdd 3.66309 1.00000 3751G 975G 2775G 26.01 1.05 212 osd.138
> 156 hdd 3.66309 1.00000 3751G 813G 2937G 21.69 0.87 176 osd.156
> 172 hdd 3.66309 1.00000 3751G 1016G 2734G 27.09 1.09 221 osd.172
> 188 hdd 3.66309 1.00000 3751G 998G 2752G 26.62 1.07 217 osd.188
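For what it's worth, a quick way to summarize that PGS column across all HDD OSDs (just a sketch; the column position may differ between releases, so check it against the header first):

    # PGS is the 10th column in this `ceph osd df` output
    ceph osd df | awk '$2=="hdd" {n++; s+=$10; if ($10>max) max=$10; if (min=="" || $10<min) min=$10} END {printf "min %d  max %d  avg %.1f\n", min, max, s/n}'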
Could these OSDs, holding more than 200 PGs, contribute to the problem?
Is there any way to confirm that we're hitting the "PG overdose
protection"? If that's true, how can we restore our cluster to normal?
Apart from getting these OSDs back to work, we're concerned about our
overall choice of the number of PGs (4096) for that (6,3) EC pool.
Any help appreciated,
Alex
[1] http://tracker.ceph.com/issues/23117
[2] https://ceph.com/community/new-luminous-pg-overdose-protection/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90