Hello, I'm new to Ceph and recently inherited a 4-node cluster with 32 OSDs and about 116 TB of raw space. The cluster is reporting very little available space, which I'm trying to recover by enabling the balancer and lowering the reweight of the most-used OSDs. My questions are: is what I did appropriate given the current state of the cluster, can I do anything more to speed up the rebalancing, and will this actually make more space available?

Some background: earlier this week MAX AVAIL on the cluster hit 0, and only then did we notice something was wrong. We removed an unused RBD image (about 3 TB) and now have a little under 1 TB of available space again. We are adding roughly 75 GB per day to this cluster, so at that rate we have about two weeks before we are back at zero.

[xxx@ceph02 ~]$ sudo ceph --cluster xxx df
RAW STORAGE:
    CLASS     SIZE        AVAIL      USED       RAW USED     %RAW USED
    hdd       116 TiB     47 TiB     69 TiB       69 TiB         59.69
    TOTAL     116 TiB     47 TiB     69 TiB       69 TiB         59.69

POOLS:
    POOL             ID     PGS      STORED      OBJECTS     USED        %USED     MAX AVAIL
    xxx-pool          1     1024       130 B           3     192 KiB         0       992 GiB
    yyy_data          6      128      23 TiB      12.08M      69 TiB     95.98       992 GiB
    yyy_metadata      7      128     5.6 GiB       2.22M     6.1 GiB      0.21       992 GiB

Cluster status:

[xxx@ceph02 ~]$ sudo ceph --cluster xxx -s
  cluster:
    id:     91ba1ea6-bfec-4ddb-a8b5-9faf842f22c3
    health: HEALTH_WARN
            1 backfillfull osd(s)
            1 nearfull osd(s)
            3 pool(s) backfillfull
            Low space hindering backfill (add storage if this doesn't resolve itself): 6 pgs backfill_toofull

  services:
    mon: 5 daemons, quorum a,b,c,d,e (age 5d)
    mgr: b(active, since 22h), standbys: a, c, d, e
    mds: registration_docs:1 {0=b=up:active} 3 up:standby
    osd: 32 osds: 32 up (since 19M), 32 in (since 3y); 36 remapped pgs

  task status:
    scrub status:
        mds.b: idle

  data:
    pools:   3 pools, 1280 pgs
    objects: 14.31M objects, 23 TiB
    usage:   70 TiB used, 47 TiB / 116 TiB avail
    pgs:     2587772/42925071 objects misplaced (6.029%)
             1244 active+clean
             17   active+remapped+backfilling
             13   active+remapped+backfill_wait
             4    active+remapped+backfill_toofull
             2    active+remapped+backfill_wait+backfill_toofull

  io:
    client:   331 KiB/s wr, 0 op/s rd, 0 op/s wr
    recovery: 141 MiB/s, 65 keys/s, 84 objects/s

Versions:

[xxx@ceph02 ~]$ rpm -qa | grep ceph
ceph-common-14.2.13-0.el7.x86_64
ceph-mds-14.2.13-0.el7.x86_64
ceph-osd-14.2.13-0.el7.x86_64
ceph-base-14.2.13-0.el7.x86_64
libcephfs2-14.2.13-0.el7.x86_64
python-ceph-argparse-14.2.13-0.el7.x86_64
ceph-selinux-14.2.13-0.el7.x86_64
ceph-mgr-14.2.13-0.el7.x86_64
ceph-14.2.13-0.el7.x86_64
python-cephfs-14.2.13-0.el7.x86_64
ceph-mon-14.2.13-0.el7.x86_64

It looks like the cluster is severely unbalanced, which I guess is expected because the balancer was set to "off":

[xxx@ceph02 ~]$ sudo ceph --cluster xxx osd df
ID CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL   %USE  VAR  PGS STATUS
 0   hdd 3.63869  1.00000 3.6 TiB 1.8 TiB 1.8 TiB 894 MiB 3.6 GiB 1.8 TiB 49.24 0.82 122     up
 1   hdd 3.63869  1.00000 3.6 TiB 1.1 TiB 1.1 TiB 581 MiB 2.4 GiB 2.5 TiB 31.07 0.52 123     up
 2   hdd 3.63869  1.00000 3.6 TiB 2.2 TiB 2.2 TiB 632 MiB 4.1 GiB 1.5 TiB 60.01 1.00 121     up
 3   hdd 3.63869  1.00000 3.6 TiB 2.7 TiB 2.7 TiB 672 MiB 5.5 GiB 975 GiB 73.84 1.24 122     up
 4   hdd 3.63869  0.94983 3.6 TiB 2.9 TiB 2.9 TiB 478 MiB 5.2 GiB 794 GiB 78.69 1.32 111     up
 5   hdd 3.63869  1.00000 3.6 TiB 2.5 TiB 2.5 TiB 900 MiB 4.7 GiB 1.1 TiB 69.52 1.16 122     up
 6   hdd 3.63869  1.00000 3.6 TiB 2.7 TiB 2.7 TiB 468 MiB 5.5 GiB 929 GiB 75.08 1.26 125     up
 7   hdd 3.63869  1.00000 3.6 TiB 1.6 TiB 1.6 TiB 731 MiB 3.2 GiB 2.0 TiB 44.54 0.75 122     up
 8   hdd 3.63869  1.00000 3.6 TiB 1.3 TiB 1.3 TiB 626 MiB 2.6 GiB 2.4 TiB 35.41 0.59 120     up
 9   hdd 3.63869  1.00000 3.6 TiB 2.5 TiB 2.5 TiB 953 MiB 4.8 GiB 1.1 TiB 69.61 1.17 122     up
10   hdd 3.63869  1.00000 3.6 TiB 2.0 TiB 2.0 TiB 526 MiB 3.9 GiB 1.6 TiB 55.64 0.93 121     up
11   hdd 3.63869  0.94983 3.6 TiB 3.4 TiB 3.4 TiB 476 MiB 6.2 GiB 242 GiB 93.50 1.57 101     up
12   hdd 3.63869  1.00000 3.6 TiB 1.4 TiB 1.4 TiB 688 MiB 3.0 GiB 2.2 TiB 39.44 0.66 117     up
13   hdd 3.63869  1.00000 3.6 TiB 1.3 TiB 1.3 TiB 738 MiB 2.8 GiB 2.3 TiB 35.98 0.60 124     up
14   hdd 3.63869  1.00000 3.6 TiB 2.8 TiB 2.8 TiB 582 MiB 5.1 GiB 879 GiB 76.40 1.28 123     up
15   hdd 3.63869  1.00000 3.6 TiB 2.5 TiB 2.5 TiB 566 MiB 4.6 GiB 1.1 TiB 68.81 1.15 124     up
16   hdd 3.63869  1.00000 3.6 TiB 1.5 TiB 1.5 TiB 625 MiB 3.1 GiB 2.2 TiB 40.23 0.67 121     up
17   hdd 3.63869  0.94983 3.6 TiB 3.2 TiB 3.2 TiB 704 MiB 6.1 GiB 427 GiB 88.55 1.48 112     up
18   hdd 3.63869  1.00000 3.6 TiB 2.0 TiB 2.0 TiB 143 MiB 3.6 GiB 1.7 TiB 54.12 0.91 124     up
19   hdd 3.63869  1.00000 3.6 TiB 2.7 TiB 2.7 TiB 522 MiB 5.0 GiB 977 GiB 73.79 1.24 126     up
20   hdd 3.63869  1.00000 3.6 TiB 2.4 TiB 2.4 TiB 793 MiB 4.5 GiB 1.2 TiB 66.79 1.12 119     up
21   hdd 3.63869  1.00000 3.6 TiB 1.8 TiB 1.8 TiB 609 MiB 3.6 GiB 1.8 TiB 49.50 0.83 122     up
22   hdd 3.63869  1.00000 3.6 TiB 2.7 TiB 2.7 TiB 600 MiB 5.0 GiB 979 GiB 73.73 1.23 122     up
23   hdd 3.63869  1.00000 3.6 TiB 953 GiB 950 GiB 579 MiB 2.4 GiB 2.7 TiB 25.57 0.43 118     up
24   hdd 3.63869  1.00000 3.6 TiB 1.8 TiB 1.8 TiB 491 MiB 3.4 GiB 1.8 TiB 49.82 0.83 121     up
25   hdd 3.63869  1.00000 3.6 TiB 2.1 TiB 2.1 TiB 836 MiB 4.5 GiB 1.5 TiB 59.07 0.99 121     up
26   hdd 3.63869  0.94983 3.6 TiB 2.9 TiB 2.9 TiB 467 MiB 5.2 GiB 794 GiB 78.69 1.32 104     up
27   hdd 3.63869  1.00000 3.6 TiB 2.0 TiB 2.0 TiB 861 MiB 3.8 GiB 1.7 TiB 54.09 0.91 123     up
28   hdd 3.63869  1.00000 3.6 TiB 1.9 TiB 1.9 TiB 262 MiB 3.6 GiB 1.8 TiB 51.00 0.85 121     up
29   hdd 3.63869  1.00000 3.6 TiB 2.7 TiB 2.7 TiB 998 MiB 5.1 GiB 937 GiB 74.86 1.25 123     up
30   hdd 3.63869  1.00000 3.6 TiB 1.8 TiB 1.8 TiB 1.1 GiB 3.6 GiB 1.8 TiB 50.77 0.85 122     up
31   hdd 3.63869  1.00000 3.6 TiB 2.3 TiB 2.3 TiB 689 MiB 4.4 GiB 1.3 TiB 63.96 1.07 121     up
                    TOTAL 116 TiB  70 TiB  69 TiB  20 GiB 134 GiB  47 TiB 59.73
MIN/MAX VAR: 0.43/1.57  STDDEV: 16.78

I enabled the balancer this morning, about 4 hours ago:

[xxx@ceph02 ~]$ sudo ceph --cluster xxx balancer status
{
    "last_optimize_duration": "0:00:01.119296",
    "plans": [],
    "mode": "crush-compat",
    "active": true,
    "optimize_result": "Optimization plan created successfully",
    "last_optimize_started": "Fri Sep 2 14:57:03 2022"
}
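For reference, turning it on comes down to the two commands below. Treat this as a sketch rather than my exact shell history; the crush-compat mode shown above may already have been configured before I touched anything:

# Enable the balancer in crush-compat mode (the mode line may be redundant if it was already set):
sudo ceph --cluster xxx balancer mode crush-compat
sudo ceph --cluster xxx balancer on

# What I have been running since, to check on progress:
sudo ceph --cluster xxx balancer status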
...and I also lowered the reweight of the most-used OSDs to 0.85 (although it looks like it is slowly drifting back toward 1?); the commands are sketched after the tree output below:

[xxx@ceph02 ~]$ sudo ceph --cluster xxx osd tree
ID  CLASS WEIGHT    TYPE NAME       STATUS REWEIGHT PRI-AFF
 -1       116.43799 root default
 -3        29.10950     host ceph01
  0   hdd   3.63869         osd.0       up  1.00000 1.00000
  1   hdd   3.63869         osd.1       up  1.00000 1.00000
  2   hdd   3.63869         osd.2       up  1.00000 1.00000
  3   hdd   3.63869         osd.3       up  1.00000 1.00000
  4   hdd   3.63869         osd.4       up  0.95981 1.00000
  5   hdd   3.63869         osd.5       up  1.00000 1.00000
  6   hdd   3.63869         osd.6       up  1.00000 1.00000
  7   hdd   3.63869         osd.7       up  1.00000 1.00000
 -5        29.10950     host ceph02
  8   hdd   3.63869         osd.8       up  1.00000 1.00000
  9   hdd   3.63869         osd.9       up  1.00000 1.00000
 10   hdd   3.63869         osd.10      up  1.00000 1.00000
 11   hdd   3.63869         osd.11      up  0.95981 1.00000
 12   hdd   3.63869         osd.12      up  1.00000 1.00000
 13   hdd   3.63869         osd.13      up  1.00000 1.00000
 14   hdd   3.63869         osd.14      up  1.00000 1.00000
 15   hdd   3.63869         osd.15      up  1.00000 1.00000
 -7        29.10950     host ceph03
 16   hdd   3.63869         osd.16      up  1.00000 1.00000
 17   hdd   3.63869         osd.17      up  0.95981 1.00000
 18   hdd   3.63869         osd.18      up  1.00000 1.00000
 19   hdd   3.63869         osd.19      up  1.00000 1.00000
 20   hdd   3.63869         osd.20      up  1.00000 1.00000
 21   hdd   3.63869         osd.21      up  1.00000 1.00000
 22   hdd   3.63869         osd.22      up  1.00000 1.00000
 23   hdd   3.63869         osd.23      up  1.00000 1.00000
 -9        29.10950     host ceph04
 24   hdd   3.63869         osd.24      up  1.00000 1.00000
 25   hdd   3.63869         osd.25      up  1.00000 1.00000
 26   hdd   3.63869         osd.26      up  0.95981 1.00000
 27   hdd   3.63869         osd.27      up  1.00000 1.00000
 28   hdd   3.63869         osd.28      up  1.00000 1.00000
 29   hdd   3.63869         osd.29      up  1.00000 1.00000
 30   hdd   3.63869         osd.30      up  1.00000 1.00000
 31   hdd   3.63869         osd.31      up  1.00000 1.00000
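The reweight change itself was done by hand with plain "ceph osd reweight", roughly as below; 4, 11, 17 and 26 are the OSDs showing a non-default REWEIGHT in the output above, and 0.85 is the value I intended to set (again a sketch from memory, not my exact history):

# Lower the reweight of the fullest OSDs so CRUSH maps fewer PGs to them:
sudo ceph --cluster xxx osd reweight 4 0.85
sudo ceph --cluster xxx osd reweight 11 0.85
sudo ceph --cluster xxx osd reweight 17 0.85
sudo ceph --cluster xxx osd reweight 26 0.85

As far as I understand, that value is what ends up in the REWEIGHT column of "ceph osd df" and "ceph osd tree" above, which is why I'm puzzled that those OSDs already read ~0.95/0.96 rather than the 0.85 I set.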