Re: Usage of devices in SSD pool vary very much

On a medium-sized cluster with device classes, I am experiencing a
problem with the SSD pool:

root@adminnode:~# ceph osd df | grep ssd
ID CLASS WEIGHT  REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS
 2   ssd 0.43700  1.00000  447GiB  254GiB  193GiB 56.77 1.28  50
 3   ssd 0.43700  1.00000  447GiB  208GiB  240GiB 46.41 1.04  58
 4   ssd 0.43700  1.00000  447GiB  266GiB  181GiB 59.44 1.34  55
30   ssd 0.43660  1.00000  447GiB  222GiB  225GiB 49.68 1.12  49
 6   ssd 0.43700  1.00000  447GiB  238GiB  209GiB 53.28 1.20  59
 7   ssd 0.43700  1.00000  447GiB  228GiB  220GiB 50.88 1.14  56
 8   ssd 0.43700  1.00000  447GiB  269GiB  178GiB 60.16 1.35  57
31   ssd 0.43660  1.00000  447GiB  231GiB  217GiB 51.58 1.16  56
34   ssd 0.43660  1.00000  447GiB  186GiB  261GiB 41.65 0.94  49
36   ssd 0.87329  1.00000  894GiB  364GiB  530GiB 40.68 0.92  91
37   ssd 0.87329  1.00000  894GiB  321GiB  573GiB 35.95 0.81  78
42   ssd 0.87329  1.00000  894GiB  375GiB  519GiB 41.91 0.94  92
43   ssd 0.87329  1.00000  894GiB  438GiB  456GiB 49.00 1.10  92
13   ssd 0.43700  1.00000  447GiB  249GiB  198GiB 55.78 1.25  72
14   ssd 0.43700  1.00000  447GiB  290GiB  158GiB 64.76 1.46  71
15   ssd 0.43700  1.00000  447GiB  368GiB 78.6GiB 82.41 1.85  78 <----
16   ssd 0.43700  1.00000  447GiB  253GiB  194GiB 56.66 1.27  70
19   ssd 0.43700  1.00000  447GiB  269GiB  178GiB 60.21 1.35  70
20   ssd 0.43700  1.00000  447GiB  312GiB  135GiB 69.81 1.57  77
21   ssd 0.43700  1.00000  447GiB  312GiB  135GiB 69.77 1.57  77
22   ssd 0.43700  1.00000  447GiB  269GiB  178GiB 60.10 1.35  67
38   ssd 0.43660  1.00000  447GiB  153GiB  295GiB 34.11 0.77  46
39   ssd 0.43660  1.00000  447GiB  127GiB  320GiB 28.37 0.64  38
40   ssd 0.87329  1.00000  894GiB  386GiB  508GiB 43.17 0.97  97
41   ssd 0.87329  1.00000  894GiB  375GiB  520GiB 41.88 0.94 113

This leaves just 1.2 TB of free space in the pool (only a few GB away
from the NEAR_FULL threshold).
The balancer plugin is currently off because it immediately crashed the
MGR in the past (on 12.2.5).
Since then I have upgraded to 12.2.8 but did not re-enable the balancer.
[I am unable to find the bug tracker ID.]

Would the balancer plugin correct this situation?
What happens if all MGRs die because of the plugin, as they did on 12.2.5?
Will the balancer take data from the most over-full OSDs first?
Otherwise an OSD may fill up beyond the FULL ratio, which would freeze
the whole pool (because the OSD with the least free space determines
the pool's usable space).
That would be the worst case: over 100 VMs would freeze, causing a lot
of trouble. This is also why I have not tried to enable the balancer
again.
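
For reference, the relevant thresholds and the fullest OSDs can be read
straight from the cluster, roughly like this (%USE is column 8 in the
ceph osd df output above; adjust the sort key if your release prints
different columns):

  # cluster-wide full / backfillfull / nearfull thresholds
  ceph osd dump | grep ratio

  # SSD OSDs sorted by %USE, fullest first
  ceph osd df | grep ssd | sort -rn -k8 | head

  # health detail names any nearfull/full OSDs explicitly
  ceph health detail | grep -i full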
Please read this [1]; it covers the balancer with upmap mode in detail.

The balancer has been stable since 12.2.8 when used in upmap mode.
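
The rough sequence looks like this (a sketch only; double-check against
the docs for your exact release, note that upmap needs all clients to be
Luminous or newer, and "myplan" is just an arbitrary plan name):

  # upmap requires luminous-or-newer clients
  ceph osd set-require-min-compat-client luminous
  ceph balancer mode upmap

  # build and inspect a plan before applying anything
  ceph balancer eval              # score of the current distribution
  ceph balancer optimize myplan
  ceph balancer show myplan       # the upmap items the plan would apply
  ceph balancer eval myplan       # expected score after the plan
  ceph balancer execute myplan    # apply the plan once, manually

  # or let the module balance automatically
  ceph balancer on
  ceph balancer status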



k

[1] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-December/032002.html

