Hi everybody,

We have a Ceph Luminous cluster with 184 SSD OSDs. About a year ago we noticed abnormal growth in one of the cluster's pools. This pool is mirrored with RBD mirroring to another Ceph cluster in a second datacenter. Below is the usage of the main pools.

#PRIMARY CLUSTER
[root@ceph01 ~]# ceph df detail
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED     OBJECTS
    659TiB     240TiB     419TiB       63.60         43.34M
POOLS:
    NAME           ID     QUOTA OBJECTS     QUOTA BYTES     USED        %USED     MAX AVAIL     OBJECTS       DIRTY       READ        WRITE       RAW USED
    images-dr      8      N/A               N/A             1.24TiB     6.42      18.2TiB       163522        163.52k     42.6GiB     247MiB      3.73TiB
    volumes        11     N/A               N/A             59.1TiB     68.46     27.2TiB       18945218      18.95M      4.81GiB     4.16GiB     118TiB
    volumes-dr     12     N/A               N/A             143TiB      83.99     27.2TiB       22108005      22.11M      1.84GiB     918MiB      286TiB

To verify the actual consumption of the images within the pools, we run rbd diff against every image in the pool and add up the results:

for j in $(rbd ls volumes)
do
    size=$(rbd diff volumes/$j | awk '{ SUM += $2 } END { print SUM/1024/1024/1024 " GiB" }')
    echo "$j;$size" >> /var/lib/report-volumes/$(date +%F)-volumes.txt
done

For the "volumes" pool this gives 56,455.43 GiB (~55TiB), a value close to what ceph df shows (59.1TiB).

for j in $(rbd ls volumes-dr)
do
    size=$(rbd diff volumes-dr/$j | awk '{ SUM += $2 } END { print SUM/1024/1024/1024 " GiB" }')
    echo "$j;$size" >> /var/lib/report-volumes/$(date +%F)-volumes-dr.txt
done

For the "volumes-dr" pool the same procedure gives 40,726.51 GiB (~40TiB), far below what ceph df shows (143TiB).

Another characteristic of these two pools is that daily snapshots are taken of all images, and each image has its own retention policy (daily, weekly or monthly). I thought this anomaly could be related to the snapshots, but we have already purged all of them (a sketch of the purge loop is below) without any significant effect on pool usage. I have also searched the forums for reports of unreclaimed space, but have not found anything concrete.

As for the mirrored pool in the DR datacenter, the usage shown there (56.5TiB) is closer to the value obtained with rbd diff.
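For reference, the snapshot purge mentioned above was done with a loop along these lines (a sketch, not the exact script we ran):

# Remove all snapshots of every image in the pool (destructive!).
# Protected snapshots (clone parents) have to be unprotected first.
for img in $(rbd ls volumes-dr)
do
    rbd snap purge volumes-dr/$img
done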
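A per-image cross-check that, unlike the rbd diff sum above, also accounts for the space still held by snapshots (which a plain rbd diff of the image head does not report) is rbd du. A minimal sketch (it runs faster when the images have the fast-diff feature enabled, but it works without it):

# Provisioned vs. used space for every image in the pool, snapshots included.
rbd du -p volumes-dr

# The same for a single image, broken down per snapshot plus the head.
rbd du volumes-dr/<image-name>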
We use pool-mode mirroring, and the source and the destination currently hold the same number of images: 223.

#CLUSTER DR
[root@ceph-dr01 ~]# ceph df detail
GLOBAL:
    SIZE       AVAIL       RAW USED     %RAW USED     OBJECTS
    217TiB     97.6TiB     119TiB       54.98         16.73M
POOLS:
    NAME           ID     QUOTA OBJECTS     QUOTA BYTES     USED        %USED     MAX AVAIL     OBJECTS       DIRTY       READ        WRITE       RAW USED
    images-dr      1      N/A               N/A             1.37TiB     6.89      18.5TiB       179953        179.95k     390MiB      198MiB      4.11TiB
    volumes-dr     3      N/A               N/A             56.5TiB     67.03     27.8TiB       16548170      16.55M      23.2GiB     59.0GiB     113TiB

Other infrastructure information:
4 virtualized monitors on CentOS 7.9.2009 (Core)
10 storage nodes (99 OSDs) with CentOS 7.9.2009 and Ceph 12.2.12
8 storage nodes (84 OSDs) with CentOS 7.9.2009 and Ceph 12.2.13

[root@ceph01]# ceph versions
{
    "mon": {
        "ceph version 12.2.13 (584a20eb0237c657dc0567da126be145106aa47e) luminous (stable)": 4
    },
    "mgr": {
        "ceph version 12.2.13 (584a20eb0237c657dc0567da126be145106aa47e) luminous (stable)": 4
    },
    "osd": {
        "ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)": 99,
        "ceph version 12.2.13 (584a20eb0237c657dc0567da126be145106aa47e) luminous (stable)": 84
    },
    "mds": {},
    "rbd-mirror": {
        "ceph version 12.2.13 (584a20eb0237c657dc0567da126be145106aa47e) luminous (stable)": 1
    },
    "overall": {
        "ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)": 99,
        "ceph version 12.2.13 (584a20eb0237c657dc0567da126be145106aa47e) luminous (stable)": 93
    }
}

One more detail: this anomaly apparently started after the last 4 storage nodes were added, which use disks of a different size - 3.8TB (the other 14 storage nodes use 4TB disks). Then again, if the disks were the problem, I would expect the other pool to be affected as well.

Has anyone faced a situation like this?

João Victor Soares
Binario Cloud
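P.S. For completeness, this is roughly how the mirroring mode and replication state can be checked on either side (a minimal sketch with the standard rbd CLI and the pool names used above):

# Mirroring mode configured for the pool (should report "pool" in our setup).
rbd mirror pool info volumes-dr

# Health and state of all mirrored images in the pool.
rbd mirror pool status volumes-dr --verbose

# Image counts on source and destination should match (223 in our case).
rbd ls volumes-dr | wc -l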