Lost TB for Object storage

Hi Guys,

 

We are running a Ceph Luminous 12.2.6 cluster.

The cluster is used both for RBD storage and Ceph Object Storage, and has about 742 TB of raw space.

 

We have an application that pushes snapshots of our VMs through RGW. Everything seems fine, except that there is a discrepancy between what the S3 API shows and what “ceph df detail” reports:

S3 API (python script):

Total: 44325.84 GB (~44.3 TB)

Ceph df detail:

NAME                          ID    USED
default.rgw.buckets.data      59    104T

 

So the difference is about 60 TB…
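For a server-side cross-check of the S3 figure, the per-bucket accounting can be summed straight from RGW. This is a minimal sketch, assuming jq is available and that every bucket reports its usage under "rgw.main" (buckets without it count as 0):

# Sum size_kb_actual over all buckets and convert KiB -> TiB.
radosgw-admin bucket stats | \
  jq '[.[].usage."rgw.main".size_kb_actual // 0] | add / 1024 / 1024 / 1024'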

 

 

We tried to clean up via garbage collection, but nothing is listed:

# radosgw-admin gc list --include-all

[]

#
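An empty list means nothing is pending garbage collection. Just in case, a GC pass can also be forced by hand instead of waiting for the scheduled run; it produces no output if the queue really is empty:

# Run a garbage-collection pass immediately.
radosgw-admin gc process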

 

After that, we tried to remove the orphans:

radosgw-admin orphans find --pool=default.rgw.buckets.data --job-id=ophans_clean

radosgw-admin orphans finish --job-id=ophans_clean

The find command reported 85 orphans, but the finish command did not seem to do anything, so we decided to delete those orphans manually by piping the output of find into a log file.
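The manual cleanup looked roughly like the sketch below. The "leaked:" line format and the parsing are assumptions about the orphans find output, so double-check against the actual log before deleting anything:

# Capture the orphan scan, then remove each leaked RADOS object by hand.
# Assumes leaked objects are printed as "leaked: <oid>" (unverified).
radosgw-admin orphans find --pool=default.rgw.buckets.data --job-id=ophans_clean > orphans.log
grep '^leaked:' orphans.log | awk '{print $2}' | while read oid; do
  rados -p default.rgw.buckets.data rm "$oid"
done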

 

Even after that, we still have a huge discrepancy between what the S3 API shows and what Ceph reports.

 

When we list objects with the S3 API, we find exactly the information the application returns (which is expected, since the application uses this API).

When we listed objects with the rados CLI, we found more objects than are visible through the S3 API.
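To see where the extra objects sit, the raw listing can be broken down by RGW's internal object types. The grep patterns below are a sketch based on how RGW usually names multipart parts and shadow (tail) objects, not something verified on this cluster:

# Dump the raw listing once, then count categories of interest.
rados -p default.rgw.buckets.data ls > rados_objects.txt
wc -l < rados_objects.txt                   # every object in the pool
grep -c '_multipart_' rados_objects.txt     # parts of multipart uploads (possibly aborted)
grep -c '_shadow_' rados_objects.txt        # tail stripes of large objects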

 

We are now out of ideas and cannot figure out what is wrong.

 

Has anyone faced this problem before?

 

Regards,

