Re: How to remove a faulty bucket?

David Turner <drakonstein@xxxxxxxxx> · Fri, 08 Dec 2017 15:45:53 +0000

You're correct; I was mistaken in thinking that a bucket could be renamed. How many buckets do you have in your RGW? I got rid of my buckets that wouldn't delete by recreating the Ceph data pool, since it held only backup data at the time.

On Fri, Dec 8, 2017 at 10:10 AM Martin Emrich <martin.emrich@xxxxxxxxxxx> wrote:
Hi!

I found no way to rename the bucket. Neither s3cmd nor radosgw-admin offers a rename option (even Amazon S3 does not support renaming buckets).

Deleting the objects did not work:

# s3cmd rb s3://bucket -r
WARNING: Bucket is not empty. Removing all the objects from it first. This may take some time...
WARNING: Remote list is empty.
ERROR: S3 error: 409 (BucketNotEmpty)

Listing the bucket shows it as empty, but there are still objects in the backend; they are just not visible to S3 clients.

The garbage collection is already listed as empty.
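
To check the GC queue repeatedly rather than by eye, something like the following Python sketch could be used. It assumes that `radosgw-admin gc list --include-all` prints a JSON array in which each entry carries an `objs` list; that field name is taken from typical output and should be verified on your release. The helper names `count_gc_entries` and `gc_queue_size` are made up for this illustration.

```python
import json
import subprocess


def count_gc_entries(gc_list_json):
    """Return (entries, objects) for the JSON printed by `radosgw-admin gc list`.

    Assumption: the output is a JSON array and each entry may have an
    "objs" list naming the RADOS objects still waiting for collection.
    """
    entries = json.loads(gc_list_json)
    objects = sum(len(entry.get("objs", [])) for entry in entries)
    return len(entries), objects


def gc_queue_size():
    """Run radosgw-admin and summarise the GC queue, including entries not yet due."""
    out = subprocess.run(
        ["radosgw-admin", "gc", "list", "--include-all"],
        check=True, capture_output=True, text=True,
    ).stdout
    return count_gc_entries(out)
```

Calling `gc_queue_size()` every few minutes and comparing the two numbers shows whether the queue is really empty or still draining.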

Regards,

Martin

From: David Turner <drakonstein@xxxxxxxxx>
Date: Friday, 8 December 2017 at 15:19
To: Martin Emrich <martin.emrich@xxxxxxxxxxx>
Cc: ceph-users <ceph-users@xxxxxxxxxxxxxx>
Subject: Re:  How to remove a faulty bucket? [WAS:Re: Resharding issues / How long does it take?]

First off, you can rename a bucket and create a new one for the application to use. You can also unlink the bucket so it is no longer owned by the access-key/user that created it. That should get your application back on its feet.

I have had very little success with bypass-gc, although I think it would be a wonderful feature if it worked. After you move the bucket to a different name, you could try a multi-threaded Python script to delete all of the objects in the bucket and then remove the bucket... maybe. I had a bucket that still couldn't be removed that way, after it had already failed to delete with bypass-gc and such. At that point, though, it took up little enough space that I could ignore it. Watch your GC queue, though, and make sure it's going down.
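
A minimal sketch of such a multi-threaded delete script is shown here; it is not the script that was actually used. It assumes a boto3-style S3 client (anything offering `get_paginator("list_objects_v2")` and `delete_objects`), and the names `iter_key_batches`, `delete_batch` and `delete_all` are hypothetical. It only removes objects that the S3 listing returns, so it cannot help when the listing comes back empty, and it does not handle versioned buckets.

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 1000  # S3 DeleteObjects accepts at most 1000 keys per request


def iter_key_batches(client, bucket, batch_size=BATCH_SIZE):
    """Yield lists of object keys, at most batch_size keys per list."""
    batch = []
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # A page of an empty bucket has no "Contents" key at all.
        for obj in page.get("Contents", []):
            batch.append(obj["Key"])
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:
        yield batch


def delete_batch(client, bucket, keys):
    """Delete one batch; return how many keys had no error reported."""
    resp = client.delete_objects(
        Bucket=bucket,
        Delete={"Objects": [{"Key": k} for k in keys], "Quiet": True},
    )
    return len(keys) - len(resp.get("Errors", []))


def delete_all(client, bucket, workers=8):
    """Delete every listed object in the bucket using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = pool.map(
            lambda keys: delete_batch(client, bucket, keys),
            iter_key_batches(client, bucket),
        )
        return sum(counts)
```

With boto3 installed, the client would be created with `boto3.client("s3", endpoint_url=...)` pointing at one of the radosgw hosts, using the access key of the bucket owner; the endpoint and credentials are site-specific. After `delete_all(client, "bucket")` returns, the bucket itself can be removed with `client.delete_bucket(Bucket="bucket")`.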

On Fri, Dec 8, 2017, 6:00 AM Martin Emrich <mailto:martin.emrich@xxxxxxxxxxx> wrote:

Followup:

I eventually gave up trying to salvage the bucket. The bucket is supposed to have ca. 110000 objects, but every attempt to run "bucket index check --fix" increased that number by 110000, so something is very wrong.

Also, deleting the bucket with "radosgw-admin bucket rm --purge-objects" failed with a "No such file or directory" error.

Even the biggest shovel I found could not remove the bucket:

# radosgw-admin bucket rm --bucket=XXXX --purge-objects --inconsistent-index --yes-i-really-mean-it --bypass-gc
2017-12-08 11:56:15.020617 7f799c326c40 -1 ERROR: could not drain handles as aio completion returned with -2
2017-12-08 11:56:16.879316 7f799c326c40 -1 ERROR: unable to remove bucket(2) No such file or directory

As the application relies on the bucket name, which is now occupied by this mystery bucket, I seem to be stuck. How can I remove this bucket?

Thanks

Martin

On 07.12.17, 16:05, "ceph-users on behalf of Martin Emrich" <mailto:ceph-users-bounces@xxxxxxxxxxxxxx on behalf of mailto:martin.emrich@xxxxxxxxxxx> wrote:

    Hi all!

    Apparently, one of my buckets went haywire during automatic resharding; the frontend application only gets a timeout after 90 s.

    After an attempt to fix the index using "radosgw-admin bucket check --fix", I tried to reshard it (6.3 GB of data in ca. 230000 objects).

    The resharding command has now been running for over an hour. There is no significant load on any of the 18 OSDs, on the host running radosgw-admin, or on any of the three radosgw hosts. The OSDs are beefy machines with HDDs for data and SSDs for index pools. We are running Ceph 12.2.2.

    How long should the resharding take? For a few minutes, radosgw-admin seemed quite busy, but now it just sits there at a few % of CPU usage.

    "radosgw-admin reshard list" reports an empty list. Reshard status reports:

    [
        {
            "reshard_status": 1,
            "new_bucket_instance_id": "c2ffcb0f-a9a3-4360-a9be-5edef965449a.6860125.1",
            "num_shards": 10
        }
    ]
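
For reading such output, a small Python sketch may help. The mapping of the numeric `reshard_status` codes (0 = none, 1 = in progress, 2 = done) is an assumption based on the cls_rgw reshard status enum in Luminous-era sources and should be checked against your release; the function name `summarize_reshard_status` is hypothetical.

```python
import json

# Assumed meaning of the numeric codes (Luminous-era cls_rgw enum);
# verify against your Ceph release before relying on it.
RESHARD_STATUS = {0: "none", 1: "in progress", 2: "done"}


def summarize_reshard_status(status_json):
    """Return one summary line per entry of the reshard status JSON array."""
    lines = []
    for i, entry in enumerate(json.loads(status_json)):
        code = entry.get("reshard_status")
        label = RESHARD_STATUS.get(code, "unknown (%r)" % (code,))
        lines.append(
            "entry %d: %s, target shards=%s, new instance=%s"
            % (
                i,
                label,
                entry.get("num_shards"),
                entry.get("new_bucket_instance_id") or "-",
            )
        )
    return lines
```

Under that assumption, the output above would read as a reshard to 10 shards that is still marked as in progress.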

    I have a feeling that the bucket index is still damaged/incomplete/inconsistent. What does the message

    *** NOTICE: operation will not remove old bucket index objects ***

    ***         these will need to be removed manually             ***

    mean? How can I clean up manually?

    Thanks,

    Martin

    _______________________________________________

    ceph-users mailing list

    mailto:ceph-users@xxxxxxxxxxxxxx

    http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
