Igor, Etienne, Bogdan,

The system is a four-node cluster. Each node has 12 x 3.8TB SSDs, and each SSD is an OSD.

I have not defined any separate DB / WAL devices - this cluster is mostly at cephadm defaults.

Every pool is currently configured with 3x replication.

The cluster also serves various RBD workloads from other pools.

There are no subvolumes and no snapshots on the CephFS volume in question.

The CephFS volume I am concerned about is called 'shared'. For the purposes of this question I am omitting information about the other pools.

[root@san1 ~]# rados df
POOL_NAME              USED  OBJECTS  CLONES    COPIES  MISSING_ON_PRIMARY  UNFOUND  DEGRADED      RD_OPS       RD      WR_OPS       WR  USED COMPR  UNDER COMPR
                     41 TiB  3834689       0  11504067                   0        0         0  3219785418  175 TiB  9330001764  229 TiB     7.0 MiB       12 MiB
cephfs.shared.meta  757 MiB       85       0       255                   0        0         0  5306018840   26 TiB  9170232158   24 TiB         0 B          0 B

total_objects    13169948
total_used       132 TiB
total_avail      33 TiB
total_space      166 TiB

[root@san1 ~]# ceph df detail
ssd    166 TiB  33 TiB  132 TiB   132 TiB      79.82
TOTAL  166 TiB  33 TiB  132 TiB   132 TiB      79.82

--- POOLS ---
POOL                ID  PGS   STORED   (DATA)  (OMAP)  OBJECTS     USED   (DATA)   (OMAP)  %USED  MAX AVAIL  QUOTA OBJECTS  QUOTA BYTES  DIRTY  USED COMPR  UNDER COMPR
cephfs.shared.meta   3   32  251 MiB  208 MiB  42 MiB       84  752 MiB  625 MiB  127 MiB      0    3.4 TiB            N/A          N/A    N/A         0 B          0 B
                     4  512   14 TiB   14 TiB     0 B    3.83M   41 TiB   41 TiB      0 B  79.90    3.4 TiB            N/A          N/A    N/A     7.0 MiB       12 MiB

[root@san1 ~]# ceph osd pool get size
size: 3

...however running 'du' in the root directory of the 'shared' volume says:

# du -sh .
5.5T    .

So yeah - 14 TiB stored turning into 41 TiB used at 3x replication is fine, but 14 TiB is a lot more than the 5.5 TiB that du reports, so... where is that space going?
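
For anyone who wants to check my arithmetic, here is a rough Python sketch using only the figures pasted above. The 4 MiB object size is the CephFS default file layout and is an assumption on my part, since I have not shown the layout actually in use on 'shared'; the function and key names are just illustrative.

```python
TIB = 2**40  # bytes per TiB
MIB = 2**20  # bytes per MiB


def summarize(stored_tib, used_tib, du_tib, objects,
              nodes, osds_per_node, osd_tb, object_size_mib=4):
    """Cross-check the figures reported by rados df, ceph df detail and du.

    object_size_mib=4 is the CephFS default layout (an assumption here).
    """
    return {
        # USED / STORED should be close to the pool size (3)
        "replication_ratio": used_tib / stored_tib,
        # Space the pool stores that du cannot account for
        "unexplained_tib": stored_tib - du_tib,
        # How many times larger STORED is than the du total
        "stored_vs_du": stored_tib / du_tib,
        # Upper bound if every object were a full default-size object
        "full_objects_tib": objects * object_size_mib * MIB / TIB,
        # Raw capacity from the drive count (vendor TB converted to TiB)
        "raw_capacity_tib": nodes * osds_per_node * osd_tb * 1e12 / TIB,
    }


report = summarize(stored_tib=14.0, used_tib=41.0, du_tib=5.5,
                   objects=3834689, nodes=4, osds_per_node=12, osd_tb=3.8)

for name, value in report.items():
    print(f"{name:>18}: {value:.2f}")
```

The replication ratio and raw capacity come out consistent with what the cluster reports, so the only number that does not fit is the gap between STORED and du. The object count multiplied by the assumed 4 MiB object size also lands close to the STORED figure rather than the du figure.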

On 14/03/2024 2:09 am, Igor Fedotov wrote:
Hi Thorn,

could you please share the output of "ceph df detail" command representing the problem?

And please give an overview of your OSD layout - amount of OSDs, shared or dedicated DB/WAL, main and DB volume sizes.



On 3/13/2024 5:58 AM, Thorne Lawler wrote:
Hi everyone!

My Ceph cluster (17.2.6) has a CephFS volume which is showing 41TB usage for the data pool, but there are only 5.5TB of files in it. There are fewer than 100 files on the filesystem in total, so where is all that space going?

How can I analyze my cephfs to understand what is using that space, and if possible, how can I reclaim that space?

Thank you.



