Re: Deleting buckets and objects fails to reduce reported cluster usage


 



On Thu, Nov 27, 2014 at 9:22 PM, Ben <b@benjackson.email> wrote:
> On 2014-11-28 15:42, Yehuda Sadeh wrote:
>>
>> On Thu, Nov 27, 2014 at 2:15 PM, b <b@benjackson.email> wrote:
>>>
>>> On 2014-11-27 11:36, Yehuda Sadeh wrote:
>>>>
>>>>
>>>> On Wed, Nov 26, 2014 at 3:49 PM, b <b@benjackson.email> wrote:
>>>>>
>>>>>
>>>>> On 2014-11-27 10:21, Yehuda Sadeh wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Nov 26, 2014 at 3:09 PM, b <b@benjackson.email> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 2014-11-27 09:38, Yehuda Sadeh wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Nov 26, 2014 at 2:32 PM, b <b@benjackson.email> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I've been deleting a bucket which originally had 60TB of data in
>>>>>>>>> it; with our cluster keeping one extra replica, the total usage
>>>>>>>>> was 120TB.
>>>>>>>>>
>>>>>>>>> I've been deleting the objects slowly using S3 Browser, and I can
>>>>>>>>> see the bucket usage is now down to around 2.5TB (5TB with
>>>>>>>>> replication), but the usage in the cluster has not changed.
>>>>>>>>>
>>>>>>>>> I've looked at garbage collection (radosgw-admin gc list --include
>>>>>>>>> all)
>>>>>>>>> and
>>>>>>>>> it just reports square brackets "[]"
>>>>>>>>>
>>>>>>>>> I've run radosgw-admin temp remove --date=2014-11-20, and it
>>>>>>>>> doesn't
>>>>>>>>> appear
>>>>>>>>> to have any effect.
>>>>>>>>>
>>>>>>>>> Is there a way to check where this space is being consumed?
>>>>>>>>>
>>>>>>>>> Running 'ceph df' the USED space in the buckets pool is not showing
>>>>>>>>> any
>>>>>>>>> of
>>>>>>>>> the 57TB that should have been freed up from the deletion so far.
>>>>>>>>>
>>>>>>>>> Running 'radosgw-admin bucket stats | jshon | grep size_kb_actual'
>>>>>>>>> and adding up all the buckets' usage shows that the space has been
>>>>>>>>> freed from the bucket, but the cluster is all sorts of messed up.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ANY IDEAS? What can I look at?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Can you run 'radosgw-admin gc list --include-all'?
>>>>>>>>
>>>>>>>> Yehuda
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I've done it before, and it just returns square brackets [] (see
>>>>>>> below)
>>>>>>>
>>>>>>> radosgw-admin gc list --include-all
>>>>>>> []
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Do you know which of the rados pools have all that extra data? Try to
>>>>>> list that pool's objects, verify that there are no surprises there
>>>>>> (e.g., use 'rados -p <pool> ls').
>>>>>>
>>>>>> Yehuda
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I'm just running that command now, and it's taking some time. There is
>>>>> a large number of objects.
>>>>>
>>>>> Once it has finished, what should I be looking for?
>>>>
>>>>
>>>>
>>>> I assume the pool in question is the one that holds your objects' data?
>>>> You should be looking for objects that are not expected to exist
>>>> anymore, and objects of buckets that don't exist anymore. The problem
>>>> here is identifying them.
>>>> I suggest starting by looking at all the existing buckets: compose a
>>>> list of the bucket prefixes for the existing buckets, and then check
>>>> whether there are objects with different prefixes.
>>>>
>>>> Yehuda
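
A rough shell sketch of that prefix comparison (the pool name .rgw.buckets
and the pre-firefly "<bucket_marker>_<object>" raw-object naming are
assumptions; adjust both for your zone):

```shell
# Hypothetical sketch: find raw-object prefixes that belong to no live bucket.
# Assumes the data pool is .rgw.buckets and that raw object names start with
# the bucket marker followed by an underscore (pre-firefly layout).

# markers of all buckets that still exist
radosgw-admin bucket stats | grep '"marker"' | cut -d'"' -f4 | sort -u > live_markers.txt

# distinct prefixes actually present in the data pool
rados -p .rgw.buckets ls | cut -d_ -f1 | sort -u > raw_prefixes.txt

# prefixes present in the pool with no matching bucket: candidates for leaked data
comm -13 live_markers.txt raw_prefixes.txt
```

Note that comm requires sorted input, hence the sort -u on both lists.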
>>>
>>>
>>>
>>> Any ideas? I've found the prefix; the number of objects in the pool that
>>> match that prefix is around 21 million, but 'radosgw-admin bucket stats'
>>> reports the bucket as only having 1.2 million objects.
>>
>>
>> Well, the objects you're seeing are raw objects, and since rgw stripes
>> the data, it is expected to have more raw objects than objects in the
>> bucket. Still, you seem to have far too many of these. You can try to
>> check whether there are pending multipart uploads that were never
>> completed, using the S3 API.
>> At the moment there's no easy way to figure out which raw objects are
>> not supposed to exist. The process would be like this:
>> 1. rados ls -p <data pool>
>> keep the list sorted
>> 2. list objects in the bucket
>> 3. for each object in (2), do: radosgw-admin object stat
>> --bucket=<bucket> --object=<object> --rgw-cache-enabled=false
>> (disabling the cache so that it goes quicker)
>> 4. look at the result of (3), and generate a list of all the parts.
>> 5. sort result of (4), compare it to (1)
>>
>> Note that if you're running firefly or later, the raw objects are not
>> listed explicitly in the output of the command you run at (3), so you
>> might need a different procedure: e.g., find out the random string that
>> the raw objects use, remove the matching entries from the list generated
>> in (1), etc.
>>
>> That's basically it.
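
Under pre-firefly naming, steps (1)-(5) above could be sketched roughly like
this (the pool and bucket names, the use of jshon, and the grep pattern for
part names in the stat output are all assumptions; verify against your own
'object stat' output first):

```shell
# Rough sketch of the 5-step comparison; names and JSON layout are assumptions.
rados -p .rgw.buckets ls | sort > all_raw.txt          # step (1)

# steps (2)-(4): stat every bucket object and collect the raw names it references
radosgw-admin bucket list --bucket=mybucket | jshon -a -e name -u |
while read -r obj; do
    radosgw-admin object stat --bucket=mybucket --object="$obj" \
        --rgw-cache-enabled=false
done | grep -o '"oid": *"[^"]*"' | cut -d'"' -f4 | sort -u > expected_raw.txt

# step (5): raw objects present in the pool but referenced by no bucket object
comm -23 all_raw.txt expected_raw.txt > orphans.txt
```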
>> I'll be interested to figure out what happened, why the garbage
>> collection didn't work correctly. You could try verifying that it's
>> working by:
>>  - create an object (let's say ~10MB in size).
>>  - radosgw-admin object stat --bucket=<bucket> --object=<object>
>>    (keep this info for later)
>>  - remove the object
>>  - run radosgw-admin gc list --include-all and verify that the raw
>> parts are listed there
>>  - wait a few hours, repeat last step, see that the parts don't appear
>> there anymore
>>  - run rados -p <pool> ls, check to see if the raw objects still exist
>>
>> Yehuda
>>
>>>
>>> Not sure where to go from here, and our cluster is slowly filling up
>>> while
>>> not clearing any space.
>
>
>
> I did the last section:
>>
>> I'll be interested to figure out what happened, why the garbage
>> collection didn't work correctly. You could try verifying that it's
>> working by:
>>  - create an object (let's say ~10MB in size).
>>  - radosgw-admin object stat --bucket=<bucket> --object=<object>
>>    (keep this info for later)
>>  - remove the object
>>  - run radosgw-admin gc list --include-all and verify that the raw
>> parts are listed there
>>  - wait a few hours, repeat last step, see that the parts don't appear
>> there anymore
>>  - run rados -p <pool> ls, check to see if the raw objects still exist
>
>
> I added the file, did a stat, and it displayed the JSON output.
> I removed the object and then tried to stat it again; this time the stat
> failed.
> After this, I ran 'radosgw-admin gc list --include-all' and it displayed
> nothing but the square brackets []

Was the object larger than 512k? Also, did you run 'gc list' within the
300 seconds after removing the object?

There should exist a garbage collection pool (by default .rgw.gc, but
it can be something different if you configured your zone differently).
Can you verify that you have it, and if so, what does it contain?

Yehuda
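
A quick way to check that (the pool name is the default, and the per-shard
object naming is an assumption about how rgw lays out the gc pool):

```shell
# Hypothetical check of the gc pool contents.
rados lspools | grep '\.rgw\.gc'      # confirm the pool exists
rados -p .rgw.gc ls | head            # shard objects (e.g. gc.0, gc.1, ...)
rados -p .rgw.gc ls | wc -l           # with "rgw gc max objs = 7877", up to ~7877 shards
```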


>
> Maybe garbage collection isn't working properly...
>
> Our gc settings are the following. We have 2 object gateways in our
> cluster, client.radosgw.obj01 and client.radosgw.obj02 (from ceph.conf):
> [client.radosgw.obj01]
>   rgw dns name = ceph.###.###
>   host = obj01
>   keyring = /etc/ceph/keyring.radosgw.obj01
>   rgw socket path = /tmp/radosgw.sock
>   log file = /var/log/ceph/radosgw.log
>   rgw data = /var/lib/ceph/radosgw/obj01
>   rgw thread pool size = 128
>   rgw print continue = True
>   debug rgw = 0
>   rgw enable ops log = False
>   log to stderr = False
>   rgw enable usage log = False
>   rgw gc max objs = 7877
>   rgw gc obj min wait = 300
>   rgw gc processor period = 600
>   rgw init timeout = 180
>   rgw gc processor max time = 600
> [client.radosgw.obj02]
>   rgw dns name = ceph.###.###
>   host = obj02
>   keyring = /etc/ceph/keyring.radosgw.obj02
>   rgw socket path = /tmp/radosgw.sock
>   log file = /var/log/ceph/radosgw.log
>   rgw data = /var/lib/ceph/radosgw/obj02
>   rgw thread pool size = 128
>   rgw print continue = True
>   debug rgw = 0
>   rgw enable ops log = False
>   log to stderr = False
>   rgw enable usage log = False
>   rgw gc max objs = 7877
>   rgw gc obj min wait = 300
>   rgw gc processor period = 600
>   rgw init timeout = 180
>   rgw gc processor max time = 600
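
For what it's worth, a back-of-envelope reading of two of the knobs above
(assuming they behave as documented: deleted parts wait at least min_wait
seconds in the gc queue, and the processor wakes every period seconds):

```shell
min_wait=300   # rgw gc obj min wait
period=600     # rgw gc processor period

# best case: the processor runs right as the wait expires
echo "$min_wait"
# worst case: the part just missed a gc pass and waits one full period on top
echo $(( min_wait + period ))
```

So with these settings a deleted part should normally be reclaimed within
roughly 5 to 15 minutes, not hours.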
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



