On 6.12.19 13:29, Jochen Schulz wrote:
Hi!

We have a ceph cluster with 42 OSDs in production, serving mainly home directories of users. Ceph is 14.2.4 Nautilus. We have 3 pools: images (for rbd images), cephfs_metadata and cephfs_data. Our raw data is about 5.6T. All pools have replica size 3; there are only very few snapshots in the rbd images pool, and the cephfs pool doesn't use snapshots.

How is it possible that the status tells us that 21T/46T is used? That is much more than 3 times the raw size. To make it even more confusing, at least half of the cluster is free, yet we get pg backfill_toofull after we added some OSDs recently. The ceph dashboard tells us the pool is 82% full and has only 4.5T free. The autoscale module seems to calculate the 20T times 3 for the space needed and thus has wrong numbers (see below). The status of the cluster is included below as well.

How can these size/capacity numbers be explained? And would there be a recommendation to change something?

Thank you in advance!

best
Jochen

# ceph -s
  cluster:
    id:     2b16167f-3f33-4580-a0e9-7a71978f403d
    health: HEALTH_ERR
            Degraded data redundancy (low space): 1 pg backfill_toofull
            1 subtrees have overcommitted pool target_size_bytes
            1 subtrees have overcommitted pool target_size_ratio
            2 pools have too many placement groups

  services:
    mon: 4 daemons, quorum jade,assam,matcha,jasmine (age 2d)
    mgr: earl(active, since 24h), standbys: assam
    mds: cephfs:1 {0=assam=up:active} 1 up:standby
    osd: 42 osds: 42 up (since 106m), 42 in (since 115m); 30 remapped pgs

  data:
    pools:   3 pools, 2048 pgs
    objects: 29.80M objects, 5.6 TiB
    usage:   21 TiB used, 25 TiB / 46 TiB avail
    pgs:     1164396/89411013 objects misplaced (1.302%)
             2018 active+clean
             22   active+remapped+backfill_wait
             7    active+remapped+backfilling
             1    active+remapped+backfill_wait+backfill_toofull

  io:
    client:   1.7 KiB/s rd, 516 KiB/s wr, 0 op/s rd, 28 op/s wr
    recovery: 9.2 MiB/s, 41 objects/s

# ceph osd pool autoscale-status
 POOL             SIZE    TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
 images           354.2G               3.0         46100G  0.0231                1.0     1024          32  warn
 cephfs_metadata  13260M               3.0         595.7G  0.0652                1.0      512           8  warn
 cephfs_data      20802G               3.0         46100G  1.3537                1.0      512              warn
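A quick sanity check on the figures above, assuming plain 3x replication and no allocation or metadata overhead:

  5.6 TiB data   * 3 replicas      = 16.8 TiB expected raw usage  (vs. 21 TiB reported)
  20802G stored  * 3 / 46100G raw  = 1.3537, the RATIO the autoscaler reports for cephfs_data

So the autoscaler ratio is simply stored size times replica count over raw capacity, as you suspected; the open question is where the extra ~4 TiB on top of 3x the data comes from.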
Please provide the output of ceph df and ceph osd df - that should explain both questions (the 21T used and the 82% full pool).
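A sketch of what to collect on 14.2.x - the detailed variants additionally show the per-pool and per-OSD breakdown:

# ceph df detail
# ceph osd df tree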