Hello everyone,

we have a cluster with 72 HDDs of 16 TB and 3 SSDs of 4 TB each across 3 nodes. The 3 SSDs are used to store the metadata pool for the CephFS filesystem.

After the update to 18.2.2 the size of the metadata pool went from around 2 TiB to over 3.5 TiB, filling up the SSD OSDs. After a few days they started to come back down, but not at the same rate: one was stuck at 90% while the other two came down to 80%, which is still far too high. After a few more days the highest one hit the full-ratio limit and all writes to the filesystem stopped. I zapped that drive and it is rebuilding at the moment. From my estimate it will end up at ~1.9 TiB (~50%), where it should be, since the rebuild only restores the data that is actually in use. The other 2 OSDs are still slowly filling up and will hit the full ratio soon. It looks like there is a leak of data that is not properly cleaned up by the MDS or the OSDs.

Can someone help me clean up the other 2 OSDs without rebuilding them? Each rebuild takes about a week. Any help is welcome.

Best regards,
Lars

Here is the current output of ceph status:

  cluster:
    id:     xxxx
    health: HEALTH_WARN
            1 MDSs report oversized cache
            1 clients failing to respond to cache pressure
            noscrub,nodeep-scrub flag(s) set
            1 nearfull osd(s)
            Degraded data redundancy: 440823880/3637190859 objects degraded (12.120%), 121 pgs degraded, 121 pgs undersized
            3 pool(s) nearfull
            1 pools have too many placement groups

  services:
    mon: 3 daemons, quorum storage02,storage01,storage03 (age 2w)
    mgr: storage02.wcuuzg(active, since 2w), standbys: storage03.kevzol, storage01.kwcjsc
    mds: 1/1 daemons up, 1 standby, 1 hot standby
    osd: 75 osds: 75 up (since 2d), 75 in (since 2d); 121 remapped pgs
         flags noscrub,nodeep-scrub

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 2209 pgs
    objects: 1.21G objects, 285 TiB
    usage:   866 TiB used, 323 TiB / 1.2 PiB avail
    pgs:     440823880/3637190859 objects degraded (12.120%)
             2088 active+clean
             101  active+undersized+degraded+remapped+backfill_wait
             20   active+undersized+degraded+remapped+backfilling

  io:
    client:   113 MiB/s rd, 58 MiB/s wr, 640 op/s rd, 1.11k op/s wr
    recovery: 0 B/s, 2.22k keys/s, 1.22k objects/s

Output of ceph fs status:

  cephfs - 27 clients
  ======
  RANK   STATE            MDS                      ACTIVITY      DNS    INOS   DIRS   CAPS
   0     active           cephfs.storage01.ieoyov  Reqs: 441 /s  4885k  3576k  2247k  399k
   0-s   standby-replay   cephfs.storage02.uonttf  Evts: 419 /s  14.2M  7672k  5298k  0

    POOL      TYPE      USED   AVAIL
  metadata    metadata  2460G  389G
  data        data      854T   70.6T
  ssd-data    data      28.5G  454G

  STANDBY MDS
  cephfs.storage03.pdgzsg

  MDS version: ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)

Lars Köppel
Developer, ariadne.ai (Germany) GmbH
https://ariadne.ai
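
P.S. For completeness, roughly how I have been watching the fill level of the metadata OSDs and pools; these are generic commands, not a transcript of my exact session:

  # Per-OSD utilisation, tree view grouped by host/device class;
  # the metadata OSDs are the three in the ssd device class
  ceph osd df tree

  # Pool-level usage over time, including the cephfs metadata pool
  ceph df detail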
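What I am considering trying next, before falling back to another week-long rebuild, is an online RocksDB compaction on the two remaining metadata OSDs, on the (unconfirmed) assumption that the extra space is held by deleted omap entries that have not been compacted away yet. The OSD ID below is only a placeholder for one of our SSD OSDs:

  # Trigger an online compaction of the OSD's RocksDB/omap data
  # (osd.73 is a placeholder ID, not one of our real OSD IDs)
  ceph tell osd.73 compact

Is that the right direction, or is there a better way to reclaim the space?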