Re: Lost space or expected?

David Turner <drakonstein@xxxxxxxxx> · Fri, 23 Mar 2018 20:53:45 +0000

The first thing I checked was whether you had any snapshots/clones in your pools, but that count is 0 for you.  Second, I would check for orphaned objects left behind by deleted RBDs.  You can do that by comparing the 'block_name_prefix' of every RBD in the pool against the prefixes of the object names in that pool:

rados -p rbd-replica-ssd ls | cut -d . -f1,2 | sort -u | grep ^rbd_data
for rbd in $(rbd ls --pool rbd-replica-ssd); do rbd info --pool rbd-replica-ssd $rbd | awk '/block_name_prefix/ {print $2}'; done | sort

Alternatively, you can let bash do the work for you by diff'ing the output of the two commands directly.  Lines marked '<' exist only in the object listing, so those are the ones to print:

diff <(rados -p rbd-replica-ssd ls | cut -d . -f1,2 | sort -u | grep ^rbd_data) <(for rbd in $(rbd ls --pool rbd-replica-ssd); do rbd info --pool rbd-replica-ssd $rbd | awk '/block_name_prefix/ {print $2}'; done | sort) | awk '/^</ {print $2}'

Anything listed is an rbd prefix that has objects but no matching RBD.  If any show up here, triple check that the RBD really doesn't exist, then find the objects with that prefix and delete them with something like `rados -p rbd-replica-ssd ls | grep "^$prefix" | xargs -n 1 rados -p rbd-replica-ssd rm` (rados rm takes object names as arguments, not on stdin).
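The comparison described above can also be run against saved listings, which makes it easy to review before deleting anything. This is only a sketch: the function name orphan_prefixes and the two input files are my own invention, and it assumes prefixes of the form rbd_data.<id> as in the commands above.

```shell
# Print rbd_data prefixes that have objects but no known image.
#   $1: file of known block_name_prefix values, one per line
#   $2: file of object names, one per line (saved `rados ls` output)
orphan_prefixes() {
    awk -F. '
        # First file: remember every prefix that belongs to an existing image.
        FILENAME == ARGV[1] { known[$0] = 1; next }
        # Second file: reduce each data object to rbd_data.<id> and report
        # each unknown prefix once.
        /^rbd_data\./ {
            p = $1 "." $2
            if (!(p in known) && !(p in seen)) { seen[p] = 1; print p }
        }
    ' "$1" "$2" | sort
}
```

To use it, save the output of the rbd info loop to one file and the rados ls output to another, then run `orphan_prefixes prefixes.txt objects.txt`. An empty result means no orphaned data objects were found.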

Also note that rbd_data is not the only object name that carries an RBD's prefix ID.  There is also rbd_header, rbd_object_map, and perhaps other things that will need to be cleaned up as well if you find orphans.  Hopefully you don't... but hopefully you do, so you can get an answer to your question and a direction to go.
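As a rough sketch of that cleanup, the filter below picks every object that belongs to one orphaned ID out of an object listing on stdin. The function name objects_for_image is hypothetical, and it only knows about the three object types named above (rbd_data, rbd_header, rbd_object_map), so treat its output as a starting list to review, not a complete one.

```shell
# Read object names on stdin and print those that belong to one image ID.
#   $1: an orphaned prefix such as rbd_data.bbb222 (hypothetical ID)
objects_for_image() {
    image_id=${1#rbd_data.}
    awk -v id="$image_id" '
        # True when name is exactly base, or base followed by a dot,
        # so that ID "bbb222" never matches an object of ID "bbb2225".
        function belongs(name, base) {
            return name == base || index(name, base ".") == 1
        }
        {
            if (belongs($0, "rbd_data." id) ||
                belongs($0, "rbd_header." id) ||
                belongs($0, "rbd_object_map." id))
                print
        }
    '
}
```

For example, `rados -p rbd-replica-ssd ls | objects_for_image rbd_data.bbb222 > to_delete.txt` gives you a file you can read through before feeding it to `xargs -n 1 rados -p rbd-replica-ssd rm`.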

On Tue, Mar 20, 2018 at 9:54 AM Caspar Smit <casparsmit@xxxxxxxxxxx> wrote:
Hi all,
Here's the output of 'rados df' for one of our clusters (Luminous 12.2.2):

POOL_NAME USED   OBJECTS  CLONES COPIES    MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS    RD     WR_OPS    WR
ec_pool   75563G 19450232      0 116701392                  0       0        0 385351922 27322G 800335856  294T
rbd       42969M    10881      0     32643                  0       0        0 615060980 14767G 970301192  207T
rbdssd      252G    65446      0    196338                  0       0        0  29392480  1581G 211205402 2601G

total_objects    19526559
total_used       148T
total_avail      111T
total_space      259T

ec_pool (k=4, m=2)
rbd (size = 3/2)
rbdssd (size = 3/2)

If I calculate the space I should be using:

ec_pool = 75 TB x 1.5 = 112.5 TB  (4+2 is storage times 1.5 right?)
rbd = 42 GB x 3 = 126 GB
rbdssd = 252 GB x 3 = 756 GB

Let's say 114TB in total.
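
The estimate above can be reproduced with a small shell sketch. It uses the rounded figures from this post (75 TB taken as 75000 GB, 42 GB, 252 GB) and assumes raw usage is simply logical size times (k+m)/k for the EC pool and times the replica count for the replicated pools. The function names are made up for illustration.

```shell
# Raw-space multiplier for an erasure-coded pool with k data and m coding chunks.
ec_multiplier() {
    awk -v k="$1" -v m="$2" 'BEGIN { printf "%.2f\n", (k + m) / k }'
}

# Expected raw usage in GB: logical GB times the pool's multiplier.
raw_gb() {
    awk -v gb="$1" -v mult="$2" 'BEGIN { printf "%.0f\n", gb * mult }'
}

ec=$(raw_gb 75000 "$(ec_multiplier 4 2)")   # ec_pool, k=4 m=2
rbd=$(raw_gb 42 3)                          # rbd, 3 replicas
ssd=$(raw_gb 252 3)                         # rbdssd, 3 replicas
echo "expected raw usage: $((ec + rbd + ssd)) GB"
```

With these inputs the sum is 113382 GB, or roughly 113.4 TB, which is where the "114TB" figure comes from.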

Why is there 148TB used space? (That's a difference of about 34TB)
Is this expected behaviour? A bug? (If so, how can I reclaim this space?)

kind regards,
Caspar

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
