CephFS metadata pool size

Hello everyone,

we have a cluster with 72 x 16 TB HDDs and 3 x 4 TB SSDs across 3 nodes.
The 3 SSDs hold the metadata pool of the CephFS filesystem. After the
update to 18.2.2 the metadata pool grew from around 2 TiB to over 3.5 TiB,
filling up the SSD OSDs.
After a few days the OSDs started to come back down, but not at the same
rate: one was stuck at 90% while the other two came down to 80%, which is
still far too high.
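
The pool and per-OSD numbers above can be checked with the usual
commands, e.g.:

    ceph df detail      # STORED vs USED of the cephfs metadata pool
    ceph osd df tree    # per-OSD utilisation of the three metadata SSDs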

After a few more days the fullest one hit the full-ratio limit and all
writes to the filesystem stopped. I zapped that drive and it is currently
rebuilding; from my estimate it will end up at ~1.9 TiB (~50%), which is
where it should be. The other 2 OSDs are still slowly filling up and will
hit the full ratio soon. It looks like there is leaked data that the MDS
or the OSDs never properly clean up, while a rebuild only restores the
data that is actually in use. Can someone help me clean up the other 2
OSDs without rebuilding them? A rebuild takes about a week per OSD.
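
What I would try, unless someone tells me it is a bad idea, is an online
compaction of the RocksDB on the two remaining metadata OSDs and, if
needed, temporarily raising the full ratio so the filesystem keeps
accepting writes. Rough sketch (the OSD IDs below are placeholders for my
metadata SSDs, not the real ones):

    # temporarily allow a bit more headroom (default full ratio is 0.95)
    ceph osd set-full-ratio 0.97

    # trigger an online RocksDB compaction on the affected metadata OSDs
    ceph tell osd.72 compact
    ceph tell osd.73 compact

I do not know whether a compaction actually frees the space here or
whether the objects are really leaked, so please correct me if this is
the wrong direction.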

Any help is welcome.
Best Regards
Lars


Here is the current output of ceph status:
cluster:
    id:     xxxx
    health: HEALTH_WARN
            1 MDSs report oversized cache
            1 clients failing to respond to cache pressure
            noscrub,nodeep-scrub flag(s) set
            1 nearfull osd(s)
            Degraded data redundancy: 440823880/3637190859 objects degraded (12.120%), 121 pgs degraded, 121 pgs undersized
            3 pool(s) nearfull
            1 pools have too many placement groups

  services:
    mon: 3 daemons, quorum storage02,storage01,storage03 (age 2w)
    mgr: storage02.wcuuzg(active, since 2w), standbys: storage03.kevzol, storage01.kwcjsc
    mds: 1/1 daemons up, 1 standby, 1 hot standby
    osd: 75 osds: 75 up (since 2d), 75 in (since 2d); 121 remapped pgs
         flags noscrub,nodeep-scrub

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 2209 pgs
    objects: 1.21G objects, 285 TiB
    usage:   866 TiB used, 323 TiB / 1.2 PiB avail
    pgs:     440823880/3637190859 objects degraded (12.120%)
             2088 active+clean
             101  active+undersized+degraded+remapped+backfill_wait
             20   active+undersized+degraded+remapped+backfilling

  io:
    client:   113 MiB/s rd, 58 MiB/s wr, 640 op/s rd, 1.11k op/s wr
    recovery: 0 B/s, 2.22k keys/s, 1.22k objects/s


Output of ceph fs status:
cephfs - 27 clients
======
RANK      STATE                 MDS               ACTIVITY     DNS    INOS   DIRS   CAPS
 0        active      cephfs.storage01.ieoyov  Reqs:  441 /s  4885k  3576k  2247k   399k
0-s   standby-replay  cephfs.storage02.uonttf  Evts:  419 /s  14.2M  7672k  5298k     0
  POOL      TYPE     USED  AVAIL
metadata  metadata  2460G   389G
  data      data     854T  70.6T
ssd-data    data    28.5G   454G
      STANDBY MDS
cephfs.storage03.pdgzsg
MDS version: ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)

Lars Köppel
Developer
Email: lars.koeppel@xxxxxxxxxx
Phone: +49 6221 5993580
ariadne.ai (Germany) GmbH
Häusserstraße 3, 69115 Heidelberg
Amtsgericht Mannheim, HRB 744040
Geschäftsführer: Dr. Fabian Svara
https://ariadne.ai