Re: Deleting buckets and objects fails to reduce reported cluster usage

On Mon, Dec 1, 2014 at 4:23 PM, Ben <b@benjackson.email> wrote:
> On 2014-12-02 11:21, Yehuda Sadeh wrote:
>>
>> On Mon, Dec 1, 2014 at 3:47 PM, Ben <b@benjackson.email> wrote:
>>>
>>> On 2014-12-02 10:40, Yehuda Sadeh wrote:
>>>>
>>>>
>>>> On Mon, Dec 1, 2014 at 3:20 PM, Ben <b@benjackson.email> wrote:
>>>>>
>>>>>
>>>>> On 2014-12-02 09:25, Yehuda Sadeh wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Dec 1, 2014 at 2:10 PM, Ben <b@benjackson.email> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 2014-12-02 08:39, Yehuda Sadeh wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sat, Nov 29, 2014 at 2:26 PM, Ben <b@benjackson.email> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 29/11/14 11:40, Yehuda Sadeh wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Nov 28, 2014 at 1:38 PM, Ben <b@benjackson.email> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 29/11/14 01:50, Yehuda Sadeh wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Nov 27, 2014 at 9:22 PM, Ben <b@benjackson.email> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2014-11-28 15:42, Yehuda Sadeh wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Nov 27, 2014 at 2:15 PM, b <b@benjackson.email> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 2014-11-27 11:36, Yehuda Sadeh wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Nov 26, 2014 at 3:49 PM, b <b@benjackson.email>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 2014-11-27 10:21, Yehuda Sadeh wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Nov 26, 2014 at 3:09 PM, b <b@benjackson.email>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 2014-11-27 09:38, Yehuda Sadeh wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Wed, Nov 26, 2014 at 2:32 PM, b <b@benjackson.email>
>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I've been deleting a bucket which originally had 60TB of data in it;
>>>>>>>>>>>>>>>>>>>>> with our cluster doing only 1 replication, the total usage was 120TB.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I've been deleting the objects slowly using S3 Browser, and I can see
>>>>>>>>>>>>>>>>>>>>> the bucket usage is now down to around 2.5TB (or 5TB with replication),
>>>>>>>>>>>>>>>>>>>>> but the usage in the cluster has not changed.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I've looked at garbage collection (radosgw-admin gc list --include-all)
>>>>>>>>>>>>>>>>>>>>> and it just reports square brackets "[]".
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I've run radosgw-admin temp remove --date=2014-11-20, and it doesn't
>>>>>>>>>>>>>>>>>>>>> appear to have any effect.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Is there a way to check where this space is being consumed?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Running 'ceph df', the USED space in the buckets pool is not showing
>>>>>>>>>>>>>>>>>>>>> any of the 57TB that should have been freed up by the deletion so far.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Running 'radosgw-admin bucket stats | jshon | grep size_kb_actual' and
>>>>>>>>>>>>>>>>>>>>> adding up all the buckets' usage shows that the space has been freed
>>>>>>>>>>>>>>>>>>>>> from the bucket, but the cluster is all sorts of messed up.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> ANY IDEAS? What can I look at?
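
A quick way to add up the figure being grepped for above (a rough sketch; it assumes 'size_kb_actual' only appears as a per-bucket usage field in the 'radosgw-admin bucket stats' JSON output):

    # total size_kb_actual across all buckets, printed in KB and TB
    radosgw-admin bucket stats | grep '"size_kb_actual"' \
        | awk -F: '{gsub(/[ ,]/, "", $2); total += $2}
                   END {printf "total: %d KB (~%.2f TB)\n", total, total/1024/1024/1024}'
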
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Can you run 'radosgw-admin gc list --include-all'?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Yehuda
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I've done it before, and it just returns square brackets [] (see below):
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> radosgw-admin gc list --include-all
>>>>>>>>>>>>>>>>>>> []
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Do you know which of the rados pools has all that extra data? Try to
>>>>>>>>>>>>>>>>>> list that pool's objects and verify that there are no surprises there
>>>>>>>>>>>>>>>>>> (e.g., use 'rados -p <pool> ls').
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yehuda
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm just running that command now, and it's taking some time. There is
>>>>>>>>>>>>>>>>> a large number of objects.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Once it has finished, what should I be looking for?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I assume the pool in question is the one that holds your objects' data?
>>>>>>>>>>>>>>>> You should be looking for objects that are not expected to exist anymore,
>>>>>>>>>>>>>>>> and objects of buckets that don't exist anymore. The problem here is to
>>>>>>>>>>>>>>>> identify these.
>>>>>>>>>>>>>>>> I suggest starting by looking at all the existing buckets: compose a list
>>>>>>>>>>>>>>>> of the bucket prefixes for the existing buckets, and then check whether
>>>>>>>>>>>>>>>> there are objects in the pool that have different prefixes.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yehuda
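
A minimal sketch of that prefix comparison. Assumptions: the data pool is .rgw.buckets, bucket markers contain no underscores, and the "marker" field reported by 'radosgw-admin bucket stats' is the prefix used in the raw object names:

    # markers (prefixes) of all buckets that still exist
    radosgw-admin bucket stats | grep '"marker":' | awk -F'"' '{print $4}' | sort -u > /tmp/live-prefixes

    # prefixes actually present in the data pool (everything before the first '_')
    rados -p .rgw.buckets ls | awk -F'_' '{print $1}' | sort -u > /tmp/pool-prefixes

    # prefixes found in the pool that belong to no existing bucket
    comm -23 /tmp/pool-prefixes /tmp/live-prefixes
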
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Any ideas? I've found the prefix, the number of objects in
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> pool
>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>> match that prefix numbers in the 21 millions
>>>>>>>>>>>>>>> The actual 'radosgw-admin bucket stats' command reports it as
>>>>>>>>>>>>>>> only
>>>>>>>>>>>>>>> having
>>>>>>>>>>>>>>> 1.2 million.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Well, the objects you're seeing are raw objects, and since rgw stripes the
>>>>>>>>>>>>>> data, it is expected to have more raw objects than objects in the bucket.
>>>>>>>>>>>>>> Still, it seems that you have far too many of these. You can try to check
>>>>>>>>>>>>>> whether there are pending multipart uploads that were never completed,
>>>>>>>>>>>>>> using the S3 API.
>>>>>>>>>>>>>> At the moment there's no easy way to figure out which raw objects are not
>>>>>>>>>>>>>> supposed to exist. The process would be like this:
>>>>>>>>>>>>>> 1. rados ls -p <data pool>; keep the list sorted
>>>>>>>>>>>>>> 2. list the objects in the bucket
>>>>>>>>>>>>>> 3. for each object in (2), run: radosgw-admin object stat --bucket=<bucket>
>>>>>>>>>>>>>>    --object=<object> --rgw-cache-enabled=false (disabling the cache so that
>>>>>>>>>>>>>>    it goes quicker)
>>>>>>>>>>>>>> 4. look at the result of (3) and generate a list of all the parts
>>>>>>>>>>>>>> 5. sort the result of (4) and compare it to (1)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Note that if you're running firefly or later, the raw objects are not
>>>>>>>>>>>>>> specified explicitly in the command you run at (3), so you might need a
>>>>>>>>>>>>>> different procedure: e.g., find out the random string used in the raw
>>>>>>>>>>>>>> object names, remove it from the list generated in (1), etc.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That's basically it.
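
A rough shell sketch of steps (1)-(5). Assumptions: pre-firefly naming (raw parts carry the bucket marker and a "__shadow_" tag in their names), the data pool is .rgw.buckets, BUCKET is a placeholder, and the grep pattern for parts should be checked against one 'object stat' output by hand before trusting it:

    # (1) every raw object name in the data pool, sorted
    rados -p .rgw.buckets ls | sort > /tmp/raw-objects

    # (2)+(3) stat every object the bucket still contains, with the cache disabled
    radosgw-admin bucket list --bucket=BUCKET | grep '"name":' | awk -F'"' '{print $4}' |
    while read -r obj; do
        radosgw-admin object stat --bucket=BUCKET --object="$obj" --rgw-cache-enabled=false
    done > /tmp/object-stats

    # (4) pull the raw part names out of the manifests (pattern is a guess; adjust it)
    grep -o '"[^"]*__shadow_[^"]*"' /tmp/object-stats | tr -d '"' | sort -u > /tmp/referenced-parts

    # (5) raw objects that nothing references any more (head objects will also show up
    #     here, so filter those out before treating anything as garbage)
    comm -23 /tmp/raw-objects /tmp/referenced-parts
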
>>>>>>>>>>>>>> I'll be interested to figure out what happened, and why the garbage
>>>>>>>>>>>>>> collection didn't work correctly. You could try verifying that it's
>>>>>>>>>>>>>> working by:
>>>>>>>>>>>>>>    - create an object (let's say ~10MB in size)
>>>>>>>>>>>>>>    - radosgw-admin object stat --bucket=<bucket> --object=<object>
>>>>>>>>>>>>>>      (keep this info, see
>>>>>>>>>>>>>>    - remove the object
>>>>>>>>>>>>>>    - run radosgw-admin gc list --include-all and verify that the raw parts
>>>>>>>>>>>>>>      are listed there
>>>>>>>>>>>>>>    - wait a few hours, repeat the last step, and see that the parts don't
>>>>>>>>>>>>>>      appear there anymore
>>>>>>>>>>>>>>    - run rados -p <pool> ls and check whether the raw objects still exist
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yehuda
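
The same verification, written out as a sketch (BUCKET and test-10mb are placeholders; the upload and delete are done with whatever S3 client you normally use, and the pool and timeout values are the ones quoted in this thread):

    # upload a ~10MB object named test-10mb with your S3 client, then keep its manifest
    radosgw-admin object stat --bucket=BUCKET --object=test-10mb > /tmp/test-manifest

    # delete the object with the S3 client, then, within the 300s 'rgw gc obj min wait':
    radosgw-admin gc list --include-all    # the raw parts should be listed here

    # a few hours later, after the gc processor has had time to run:
    radosgw-admin gc list --include-all    # the parts should no longer be listed
    rados -p .rgw.buckets ls | grep <prefix-from-manifest>    # ...and gone from the pool
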
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Not sure where to go from here, and our cluster is slowly
>>>>>>>>>>>>>>> filling
>>>>>>>>>>>>>>> up
>>>>>>>>>>>>>>> while
>>>>>>>>>>>>>>> not clearing any space.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I did the last section:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'll be interested to figure out what happened, and why the garbage
>>>>>>>>>>>>>> collection didn't work correctly. You could try verifying that it's
>>>>>>>>>>>>>> working by:
>>>>>>>>>>>>>>    - create an object (let's say ~10MB in size)
>>>>>>>>>>>>>>    - radosgw-admin object stat --bucket=<bucket> --object=<object>
>>>>>>>>>>>>>>      (keep this info, see
>>>>>>>>>>>>>>    - remove the object
>>>>>>>>>>>>>>    - run radosgw-admin gc list --include-all and verify that the raw parts
>>>>>>>>>>>>>>      are listed there
>>>>>>>>>>>>>>    - wait a few hours, repeat the last step, and see that the parts don't
>>>>>>>>>>>>>>      appear there anymore
>>>>>>>>>>>>>>    - run rados -p <pool> ls and check whether the raw objects still exist
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I added the file, did a stat, and it displayed the JSON output.
>>>>>>>>>>>>> I removed the object and then tried to stat it again; this time it failed
>>>>>>>>>>>>> to stat the object.
>>>>>>>>>>>>> After this, I ran the gc list --include-all command and it displayed
>>>>>>>>>>>>> nothing but the square brackets [].
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Was the object larger than 512k? Also, did you check within the 300 seconds
>>>>>>>>>>>> after removing the object?
>>>>>>>>>>>>
>>>>>>>>>>>> There should exist a garbage collection pool (by default .rgw.gc, but it can
>>>>>>>>>>>> be something different if you configured your zone differently). Can you
>>>>>>>>>>>> verify that you have it, and if so, what does it contain?
>>>>>>>>>>>>
>>>>>>>>>>>> Yehuda
>>>>>>>>>>>>
>>>>>>>>>>> Yes, the object was 10M. As soon as I had deleted it from the bucket, I ran
>>>>>>>>>>> the command to check garbage collection.
>>>>>>>>>>> There is a .rgw.gc pool; we haven't changed it from the default. It contains
>>>>>>>>>>> a number of objects, ~7800, but the size of the files is 0kb.
>>>>>>>>>>>
>>>>>>>>>> They're expected to be 0kb; the data only resides in their omap, and that's
>>>>>>>>>> not reflected in the object size. You could run 'rados listomapkeys' on these.
>>>>>>>>>>
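
A quick way to see whether any gc shard actually holds entries (a sketch, assuming the default .rgw.gc pool mentioned above):

    # each gc.N object keeps its entries in omap; print the shards that are not empty
    for obj in $(rados -p .rgw.gc ls); do
        n=$(rados -p .rgw.gc listomapkeys "$obj" | wc -l)
        [ "$n" -gt 0 ] && echo "$obj: $n omap keys"
    done
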
>>>>>>>>>>>>> Maybe garbage collection isn't working properly...
>>>>>>>>>>>>>
>>>>>>>>>>>>> Our gc settings are the following; we have 2 object gateways in our
>>>>>>>>>>>>> cluster, too: client.radosgw.obj01 and client.radosgw.obj02 (from ceph.conf)
>>>>>>>>>>>>> [client.radosgw.obj01]
>>>>>>>>>>>>>     rgw dns name = ceph.###.###
>>>>>>>>>>>>>     host = obj01
>>>>>>>>>>>>>     keyring = /etc/ceph/keyring.radosgw.obj01
>>>>>>>>>>>>>     rgw socket path = /tmp/radosgw.sock
>>>>>>>>>>>>>     log file = /var/log/ceph/radosgw.log
>>>>>>>>>>>>>     rgw data = /var/lib/ceph/radosgw/obj01
>>>>>>>>>>>>>     rgw thread pool size = 128
>>>>>>>>>>>>>     rgw print continue = True
>>>>>>>>>>>>>     debug rgw = 0
>>>>>>>>>>>>>     rgw enable ops log = False
>>>>>>>>>>>>>     log to stderr = False
>>>>>>>>>>>>>     rgw enable usage log = False
>>>>>>>>>>>>>     rgw gc max objs = 7877
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> You should put this line (rgw gc max objs) in the global section of your
>>>>>>>>>> ceph.conf. Either that, or run your radosgw-admin command with
>>>>>>>>>> '-n client.radosgw.obj02'. That might change some of the results you're
>>>>>>>>>> seeing (radosgw-admin gc list --include-all, etc.).
>>>>>>>>>>
>>>>>>>>>> Yehuda
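
For reference, a sketch of the two options described there, using the names from this thread (adjust to your own setup):

    # option 1: put the non-default gc sharding where radosgw-admin will also see it
    [global]
        rgw gc max objs = 7877

    # option 2: run the admin command as one of the gateway identities instead
    radosgw-admin gc list --include-all -n client.radosgw.obj02
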
>>>>>>>>>>
>>>>>>>>>>>>>     rgw gc obj min wait = 300
>>>>>>>>>>>>>     rgw gc processor period = 600
>>>>>>>>>>>>>     rgw init timeout = 180
>>>>>>>>>>>>>     rgw gc processor max time = 600
>>>>>>>>>>>>> [client.radosgw.obj02]
>>>>>>>>>>>>>     rgw dns name = ceph.###.###
>>>>>>>>>>>>>     host = obj02
>>>>>>>>>>>>>     keyring = /etc/ceph/keyring.radosgw.obj02
>>>>>>>>>>>>>     rgw socket path = /tmp/radosgw.sock
>>>>>>>>>>>>>     log file = /var/log/ceph/radosgw.log
>>>>>>>>>>>>>     rgw data = /var/lib/ceph/radosgw/obj02
>>>>>>>>>>>>>     rgw thread pool size = 128
>>>>>>>>>>>>>     rgw print continue = True
>>>>>>>>>>>>>     debug rgw = 0
>>>>>>>>>>>>>     rgw enable ops log = False
>>>>>>>>>>>>>     log to stderr = False
>>>>>>>>>>>>>     rgw enable usage log = False
>>>>>>>>>>>>>     rgw gc max objs = 7877
>>>>>>>>>>>>>     rgw gc obj min wait = 300
>>>>>>>>>>>>>     rgw gc processor period = 600
>>>>>>>>>>>>>     rgw init timeout = 180
>>>>>>>>>>>>>     rgw gc processor max time = 600
>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I've finally deleted the entire bucket. All 60TB was cleared from the bucket,
>>>>>>>>> and the bucket no longer exists.
>>>>>>>>>
>>>>>>>>> Yet running rados ls -p .rgw.buckets | grep '4804.14' still lists all the
>>>>>>>>> _shadow_ files that have that bucket's prefix.
>>>>>>>>>
>>>>>>>>> Any ideas why these aren't being deleted/cleaned up by garbage collection?
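
A slightly tighter version of that check (same pool and marker as above; the dot is escaped so grep does not treat it as a wildcard):

    # count the leftover raw objects that still carry the deleted bucket's marker
    rados -p .rgw.buckets ls | grep -c '4804\.14_'
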
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Are there any errors in the log? Can you provide a log (debug rgw =
>>>>>>>> 20, debug ms = 1) of the radosgw through the garbage collection
>>>>>>>> stage?
>>>>>>>>
>>>>>>>> Yehuda
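
A sketch of turning that logging on for one gateway, assuming the [client.radosgw.obj01] section shown earlier in the thread (the radosgw process needs a restart to pick it up):

    # in ceph.conf on the gateway host, then restart the radosgw service
    [client.radosgw.obj01]
        debug rgw = 20
        debug ms = 1
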
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hi Yehuda,
>>>>>>>
>>>>>>> Here is a snapshot of the log.
>>>>>>>
>>>>>>> I created a new bucket, uploaded a 25MB file, and then deleted it.
>>>>>>> I did a gc list --include-all and could see the shadow files.
>>>>>>>
>>>>>>> I then forced a gc process, and the log has gone crazy with this stuff in it,
>>>>>>> and it is still going.
>>>>>>>
>>>>>>>
>>>>>>> 2014-12-02 09:07:59.634544 7f86256f3700  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.13:6816/17803 -- ping v1 -- ?+0 0x7f8558000d20 con 0x1e9c470
>>>>>>> 2014-12-02 09:07:59.634578 7f86256f3700  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.12:6830/3830 -- ping v1 -- ?+0 0x7f8558001120 con 0x1e93090
>>>>>>> 2014-12-02 09:07:59.634592 7f86256f3700  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.14:6818/3517 -- ping v1 -- ?+0 0x7f8558001700 con 0x1e9d230
>>>>>>> 2014-12-02 09:07:59.634607 7f86256f3700  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.10:6806/3397 -- ping v1 -- ?+0 0x7f8558006050 con 0x1e9e0e0
>>>>>>> 2014-12-02 09:07:59.634621 7f86256f3700  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.19:6830/4000 -- ping v1 -- ?+0 0x7f8558000a10 con 0x1e9fdc0
>>>>>>> 2014-12-02 09:07:59.634634 7f86256f3700  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.11:6809/3451 -- ping v1 -- ?+0 0x7f8558001d50 con 0x1e9ef10
>>>>>>> 2014-12-02 09:07:59.634647 7f86256f3700  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.19:6803/3449 -- ping v1 -- ?+0 0x7f8558001fe0 con 0x1ea1aa0
>>>>>>> 2014-12-02 09:07:59.634660 7f86256f3700  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.17:6827/14001 -- ping v1 -- ?+0 0x7f85580028c0 con 0x1ea0bf0
>>>>>>> 2014-12-02 09:07:59.637762 7f86277f8700  1 -- 10.150.2.2:0/1009990
>>>>>>> <==
>>>>>>> osd.108 10.150.2.18:6827/13867 158 ==== osd_op_reply(14761 gc.1283
>>>>>>> [call]
>>>>>>> v0'0 uv98856 ondisk = 0) v6 ==== 174+0+11 (1634423764 0 1993775135)
>>>>>>> 0x7f8570006050 con 0x1f1e4d0
>>>>>>> 2014-12-02 09:07:59.637937 7f8630ba57c0  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.18:6827/13867 -- osd_op(client.6359480.0:14762 gc.1283 [call
>>>>>>> lock.unlock] 6.10bb9719 ondisk+write e19074) v4 -- ?+0 0x1f30a80 con
>>>>>>> 0x1f1e4d0
>>>>>>> 2014-12-02 09:07:59.652166 7f86277f8700  1 -- 10.150.2.2:0/1009990
>>>>>>> <==
>>>>>>> osd.108 10.150.2.18:6827/13867 159 ==== osd_op_reply(14762 gc.1283
>>>>>>> [call]
>>>>>>> v19074'98857 uv98857 ondisk = 0) v6 ==== 174+0+0 (1358419719 0 0)
>>>>>>> 0x7f8570006050 con 0x1f1e4d0
>>>>>>> 2014-12-02 09:07:59.652293 7f8630ba57c0  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.15:6805/1032 -- osd_op(client.6359480.0:14763 gc.1284 [call
>>>>>>> lock.lock] 6.4a5f5aa9 ondisk+write e19074) v4 -- ?+0 0x1f30a80 con
>>>>>>> 0x1f290a0
>>>>>>> 2014-12-02 09:07:59.670108 7f86277f8700  1 -- 10.150.2.2:0/1009990
>>>>>>> <==
>>>>>>> osd.53 10.150.2.15:6805/1032 130 ==== osd_op_reply(14763 gc.1284
>>>>>>> [call]
>>>>>>> v19074'100398 uv100398 ondisk = 0) v6 ==== 174+0+0 (3807569510 0 0)
>>>>>>> 0x7f85b0000b60 con 0x1f290a0
>>>>>>> 2014-12-02 09:07:59.670239 7f8630ba57c0  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.15:6805/1032 -- osd_op(client.6359480.0:14764 gc.1284 [call
>>>>>>> rgw.gc_list] 6.4a5f5aa9 ack+read e19074) v4 -- ?+0 0x1f30a80 con
>>>>>>> 0x1f290a0
>>>>>>> 2014-12-02 09:07:59.673871 7f86277f8700  1 -- 10.150.2.2:0/1009990
>>>>>>> <==
>>>>>>> osd.53 10.150.2.15:6805/1032 131 ==== osd_op_reply(14764 gc.1284
>>>>>>> [call]
>>>>>>> v0'0
>>>>>>> uv100398 ondisk = 0) v6 ==== 174+0+11 (4101371043 0 1993775135)
>>>>>>> 0x7f85b0000b60 con 0x1f290a0
>>>>>>> 2014-12-02 09:07:59.674042 7f8630ba57c0  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.15:6805/1032 -- osd_op(client.6359480.0:14765 gc.1284 [call
>>>>>>> lock.unlock] 6.4a5f5aa9 ondisk+write e19074) v4 -- ?+0 0x1f30a80 con
>>>>>>> 0x1f290a0
>>>>>>> 2014-12-02 09:07:59.686255 7f86277f8700  1 -- 10.150.2.2:0/1009990
>>>>>>> <==
>>>>>>> osd.53 10.150.2.15:6805/1032 132 ==== osd_op_reply(14765 gc.1284
>>>>>>> [call]
>>>>>>> v19074'100399 uv100399 ondisk = 0) v6 ==== 174+0+0 (2220917153 0 0)
>>>>>>> 0x7f85b0000b60 con 0x1f290a0
>>>>>>> 2014-12-02 09:07:59.686376 7f8630ba57c0  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.19:6812/3767 -- osd_op(client.6359480.0:14766 gc.1285 [call
>>>>>>> lock.lock] 6.9f912e07 ondisk+write e19074) v4 -- ?+0 0x1f1f560 con
>>>>>>> 0x1f0ea90
>>>>>>> 2014-12-02 09:07:59.721432 7f86277f8700  1 -- 10.150.2.2:0/1009990
>>>>>>> <==
>>>>>>> osd.83 10.150.2.19:6812/3767 163 ==== osd_op_reply(14766 gc.1285
>>>>>>> [call]
>>>>>>> v19074'302537 uv302537 ondisk = 0) v6 ==== 174+0+0 (2463619648 0 0)
>>>>>>> 0x7f85b8000900 con 0x1f0ea90
>>>>>>> 2014-12-02 09:07:59.721591 7f8630ba57c0  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.19:6812/3767 -- osd_op(client.6359480.0:14767 gc.1285 [call
>>>>>>> rgw.gc_list] 6.9f912e07 ack+read e19074) v4 -- ?+0 0x1f30a80 con
>>>>>>> 0x1f0ea90
>>>>>>> 2014-12-02 09:07:59.725699 7f86277f8700  1 -- 10.150.2.2:0/1009990
>>>>>>> <==
>>>>>>> osd.83 10.150.2.19:6812/3767 164 ==== osd_op_reply(14767 gc.1285
>>>>>>> [call]
>>>>>>> v0'0
>>>>>>> uv302537 ondisk = 0) v6 ==== 174+0+11 (3924946656 0 1993775135)
>>>>>>> 0x7f85b8000900 con 0x1f0ea90
>>>>>>> 2014-12-02 09:07:59.725868 7f8630ba57c0  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.19:6812/3767 -- osd_op(client.6359480.0:14768 gc.1285 [call
>>>>>>> lock.unlock] 6.9f912e07 ondisk+write e19074) v4 -- ?+0 0x1f1f560 con
>>>>>>> 0x1f0ea90
>>>>>>> 2014-12-02 09:07:59.738287 7f86277f8700  1 -- 10.150.2.2:0/1009990
>>>>>>> <==
>>>>>>> osd.83 10.150.2.19:6812/3767 165 ==== osd_op_reply(14768 gc.1285
>>>>>>> [call]
>>>>>>> v19074'302538 uv302538 ondisk = 0) v6 ==== 174+0+0 (314638285 0 0)
>>>>>>> 0x7f85b8000900 con 0x1f0ea90
>>>>>>> 2014-12-02 09:07:59.738415 7f8630ba57c0  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.10:6803/3331 -- osd_op(client.6359480.0:14769 gc.1286 [call
>>>>>>> lock.lock] 6.61beb660 ondisk+write e19074) v4 -- ?+0 0x1f30a80 con
>>>>>>> 0x1f1ce50
>>>>>>> 2014-12-02 09:07:59.764523 7f86277f8700  1 -- 10.150.2.2:0/1009990
>>>>>>> <==
>>>>>>> osd.1
>>>>>>> 10.150.2.10:6803/3331 145 ==== osd_op_reply(14769 gc.1286 [call]
>>>>>>> v19074'136677 uv136677 ondisk = 0) v6 ==== 174+0+0 (3559159328 0 0)
>>>>>>> 0x7f8540002dd0 con 0x1f1ce50
>>>>>>> 2014-12-02 09:07:59.764670 7f8630ba57c0  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.10:6803/3331 -- osd_op(client.6359480.0:14770 gc.1286 [call
>>>>>>> rgw.gc_list] 6.61beb660 ack+read e19074) v4 -- ?+0 0x1f1f560 con
>>>>>>> 0x1f1ce50
>>>>>>> 2014-12-02 09:07:59.767836 7f86277f8700  1 -- 10.150.2.2:0/1009990
>>>>>>> <==
>>>>>>> osd.1
>>>>>>> 10.150.2.10:6803/3331 146 ==== osd_op_reply(14770 gc.1286 [call] v0'0
>>>>>>> uv136677 ondisk = 0) v6 ==== 174+0+11 (1532209036 0 1993775135)
>>>>>>> 0x7f8540002dd0 con 0x1f1ce50
>>>>>>> 2014-12-02 09:07:59.768009 7f8630ba57c0  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.10:6803/3331 -- osd_op(client.6359480.0:14771 gc.1286 [call
>>>>>>> lock.unlock] 6.61beb660 ondisk+write e19074) v4 -- ?+0 0x1f30a80 con
>>>>>>> 0x1f1ce50
>>>>>>> 2014-12-02 09:07:59.780792 7f86277f8700  1 -- 10.150.2.2:0/1009990
>>>>>>> <==
>>>>>>> osd.1
>>>>>>> 10.150.2.10:6803/3331 147 ==== osd_op_reply(14771 gc.1286 [call]
>>>>>>> v19074'136678 uv136678 ondisk = 0) v6 ==== 174+0+0 (1412649901 0 0)
>>>>>>> 0x7f8540002dd0 con 0x1f1ce50
>>>>>>> 2014-12-02 09:07:59.780968 7f8630ba57c0  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.12:6833/4016 -- osd_op(client.6359480.0:14772 gc.1287 [call
>>>>>>> lock.lock] 6.9bfdc387 ondisk+write e19074) v4 -- ?+0 0x1f1f560 con
>>>>>>> 0x1f3b2f0
>>>>>>> 2014-12-02 09:07:59.804154 7f86277f8700  1 -- 10.150.2.2:0/1009990
>>>>>>> <==
>>>>>>> osd.65 10.150.2.12:6833/4016 118 ==== osd_op_reply(14772 gc.1287
>>>>>>> [call]
>>>>>>> v19074'279063 uv279063 ondisk = 0) v6 ==== 174+0+0 (4117862336 0 0)
>>>>>>> 0x7f85d80024b0 con 0x1f3b2f0
>>>>>>> 2014-12-02 09:07:59.804279 7f8630ba57c0  1 -- 10.150.2.2:0/1009990
>>>>>>> -->
>>>>>>> 10.150.2.12:6833/4016 -- osd_op(client.6359480.0:14773 gc.1287 [call
>>>>>>> rgw.gc_list] 6.9bfdc387 ack+read e19074) v4 -- ?+0 0x1f30a80 con
>>>>>>> 0x1f3b2f0
>>>>>>>
>>>>>>
>>>>>>
>>>>>> Can you find the part of the log where it's handling a shard that
>>>>>> actually has an object inside? These are all just iterating over empty
>>>>>> shards.
>>>>>>
>>>>>> Yehuda
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> How can I tell if the shard has an object in it from the logs?
>>>>
>>>>
>>>>
>>>> Search for a different sequence (e.g., search for rgw.gc_remove).
>>>>
>>>> Yehuda
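
A quick way to do that search, and to see whether gc is removing anything at all (the log path is the one from the ceph.conf quoted above):

    # gc_list appears while the processor scans shards; gc_remove only appears when
    # it actually trims entries
    grep -c 'rgw.gc_list'   /var/log/ceph/radosgw.log
    grep -c 'rgw.gc_remove' /var/log/ceph/radosgw.log
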
>>>
>>>
>>>
>>>
>>> 0 Results in the logs for rgw.gc_remove
>>
>>
>> Well, something is modifying the gc log. Do you happen to have more
>> than one radosgw running on the same cluster?
>>
>> Yehuda
>
>
> We have 2 radosgw servers
> obj01 and obj02

Are both of them pointing at the same zone?
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



