Hi,
I have a rook-provisioned cluster to be used for RBDs only. I have 2 pools named replicated-metadata-pool and ec-data-pool. EC parameters are 6+3. I've been writing some data to this cluster for some time and noticed that the reported usage is not what I was expecting.
# ceph df
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
    hdd       5.4 PiB     4.3 PiB     1.2 PiB     1.2 PiB          21.77
    TOTAL     5.4 PiB     4.3 PiB     1.2 PiB     1.2 PiB          21.77

POOLS:
    POOL                         ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
    replicated-metadata-pool      1     90 KiB      408         38 MiB          0     1.2 PiB
    ec-data-pool                  2     722 TiB     191.64M     1.2 PiB     25.04     2.4 PiB
Since these numbers are rounded too coarsely, I generally use the Prometheus metrics exposed by the mgr, which are as follows:
ceph_pool_stored : 793,746 G for ec-data-pool and 92323 for replicated-metadata-pool
ceph_pool_stored_raw: 1,190,865 G for ec-data-pool and 99213 for replicated-metadata-pool
ceph_cluster_total_used_bytes: 1,329,374 G
ceph_cluster_total_used_raw_bytes: 1,333,013 G
sum(ceph_bluefs_db_used_bytes) : 3,638 G
So ceph_pool_stored for the EC pool is a bit higher than the total used space of the formatted RBDs. I think that's because of the sparse nature and deleted blocks not being fstrimmed yet. That's OK.
ceph_pool_stored_raw is almost exactly 1.5x ceph_pool_stored which is what I'd expect considering EC parameters of 6+3.
What I can't account for is the 138,509 G difference between ceph_cluster_total_used_bytes and ceph_pool_stored_raw. The difference is not a fixed amount, BTW: checking the same data historically shows total usage is consistently about 1.12x what we expect. This seems to turn our 1.5x EC overhead into roughly a 1.68x overhead in reality. Does anyone have any idea why this is the case?
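To make the arithmetic explicit, here is a small Python sketch of the ratios I am describing. The variable names are my own, the values are the metric readings quoted above (in G), and the tiny replicated-metadata-pool is ignored as negligible.

```python
# Metric readings quoted above, in G. Variable names are illustrative only.
pool_stored = 793_746        # ceph_pool_stored for ec-data-pool
pool_stored_raw = 1_190_865  # ceph_pool_stored_raw for ec-data-pool
cluster_used = 1_329_374     # ceph_cluster_total_used_bytes

# EC profile 6+3: k data chunks plus m coding chunks per stripe.
k, m = 6, 3
expected_overhead = (k + m) / k  # theoretical raw/stored ratio

pool_overhead = pool_stored_raw / pool_stored      # what the pool reports
unexplained = cluster_used - pool_stored_raw       # space I cannot account for
extra_factor = cluster_used / pool_stored_raw      # cluster usage vs. expected raw
effective_overhead = cluster_used / pool_stored    # overall raw/stored ratio

print(f"expected EC overhead : {expected_overhead:.2f}x")
print(f"reported pool ratio  : {pool_overhead:.4f}x")
print(f"unexplained space    : {unexplained:,} G")
print(f"extra factor         : {extra_factor:.4f}x")
print(f"effective overhead   : {effective_overhead:.4f}x")
```

The product of the expected EC overhead and the extra factor is what I am calling the effective overhead.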
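There is also a ceph_cluster_total_used_raw_bytes metric, which I believe is close to data + metadata. That is why I included sum(ceph_bluefs_db_used_bytes) above: it roughly matches the gap between ceph_cluster_total_used_raw_bytes and ceph_cluster_total_used_bytes. Is that interpretation correct?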
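The comparison I have in mind can be sketched as follows, again in Python with the readings quoted above (in G). Treating the gap as BlueFS DB usage is my assumption, not something I have confirmed.

```python
# Metric readings quoted above, in G. Variable names are illustrative only.
cluster_used = 1_329_374      # ceph_cluster_total_used_bytes
cluster_used_raw = 1_333_013  # ceph_cluster_total_used_raw_bytes
bluefs_db_used = 3_638        # sum(ceph_bluefs_db_used_bytes)

# Assumption: raw used = used (data) + metadata, with metadata ~ BlueFS DB usage.
gap = cluster_used_raw - cluster_used
residual = gap - bluefs_db_used  # what is left if the assumption holds

print(f"raw used minus used : {gap:,} G")
print(f"BlueFS DB used      : {bluefs_db_used:,} G")
print(f"residual            : {residual:,} G")
```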
Best,
--
erdem agaoglu