Re: Undo "radosgw-admin bi purge"

Everything you describe is expected. I was not aware that `reshard` could be run after the prior shards had been removed, but apparently it can, and it creates new, empty bucket index shards. Normally `reshard` reads entries from the old shards and copies them to the new shards, but since the old shards no longer exist, there is nothing to copy. I presume other respondents suggested the reshard to allow for a bucket removal, which you verified.

You’re correct that the objects in the data pool still exist. To list them you could run `rgw-orphan-list`, which outputs objects in the data pool that are not referenced by any bucket index. Note that on large clusters it can take a while to run. If, after reviewing the list, you have verified that the objects are not in use, you can remove them with `rados` commands. `rgw-orphan-list` is still considered experimental, but it has successfully helped clean up large clusters.
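
As a rough illustration of the removal step, here is a minimal sketch. It assumes you have saved the reviewed object names, one per line, in a file; the function name, the file name `orphans-reviewed.txt`, and the pool name in the usage comment are placeholders of mine, not anything the tool produces. By default it only prints the `rados` commands so the list can be checked once more before anything is deleted.

```shell
#!/usr/bin/env bash
# Sketch only: remove_listed_objects is a hypothetical helper, not part of Ceph.
# It reads object names (one per line) from a reviewed list file and either
# prints the "rados rm" command for each (default, dry run) or runs it.
#
# Usage: remove_listed_objects <pool> <list-file> [--execute]
remove_listed_objects() {
    local pool="$1" list="$2" mode="${3:-}"
    local obj
    while IFS= read -r obj || [ -n "$obj" ]; do
        # Skip empty lines in the list.
        [ -n "$obj" ] || continue
        if [ "$mode" = "--execute" ]; then
            rados -p "$pool" rm "$obj"
        else
            # Dry run: shown for review only, names are not shell-quoted here.
            printf 'rados -p %s rm %s\n' "$pool" "$obj"
        fi
    done < "$list"
}

# Example (pool and file names are assumptions):
#   remove_listed_objects default.rgw.buckets.data orphans-reviewed.txt
#   remove_listed_objects default.rgw.buckets.data orphans-reviewed.txt --execute
```

Running it without `--execute` first keeps the destructive step explicit, which seems prudent given that the orphan list comes from an experimental tool.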

You also asked why there is no command to scan the data pool and recreate the bucket index. I think the concept would work, since all head objects include the bucket marker in their names. There might be corner cases where it would partially fail, such as (possibly) transactional changes that were underway when the bucket index was purged. There is also metadata in the bucket index that is not stored in the objects, so it would have to be recreated somehow. But no one has written it yet.
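
To make the idea concrete, the first step of such a scan could look roughly like the sketch below. This is not an existing tool, and it only recovers candidate object names, none of the index metadata. It assumes that head objects are named `<marker>_<key>` and that tail objects use the prefixes `<marker>__shadow_` and `<marker>__multipart_`; keys that themselves begin with an underscore are not handled. The function name and the pool name in the usage comment are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: head_keys_for_marker is a hypothetical helper, not part of Ceph.
# Reads RADOS object names on stdin and prints the S3 keys of objects that
# look like head objects belonging to the given bucket marker.
#
# Assumed naming (verify against your cluster before relying on it):
#   head object:  <marker>_<key>
#   tail objects: <marker>__shadow_...  and  <marker>__multipart_...
head_keys_for_marker() {
    local marker="$1" prefix name
    prefix="${marker}_"
    while IFS= read -r name || [ -n "$name" ]; do
        case "$name" in
            # Tail and multipart parts are not head objects; skip them.
            "${prefix}"_shadow_*|"${prefix}"_multipart_*) continue ;;
            # Anything else with the marker prefix is treated as a head object.
            "${prefix}"*) printf '%s\n' "${name#"$prefix"}" ;;
        esac
    done
}

# Example (pool name is an assumption, <bucket-marker> is a placeholder):
#   rados -p default.rgw.buckets.data ls | head_keys_for_marker <bucket-marker>
```

The output is just a list of candidate keys. Sizes, ETags, versioning state, and the other per-entry metadata would still have to be rebuilt some other way, which is the harder part.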

Eric
(he/him)

> On Feb 22, 2023, at 11:04 AM, Robert Sander <r.sander@xxxxxxxxxxxxxxxxxxx> wrote:
> 
> On 22.02.23 14:42, David Orman wrote:
>> If it's a test cluster, you could try:
>> root@ceph01:/# radosgw-admin bucket check -h |grep -A1 check-objects
>>    --check-objects           bucket check: rebuilds bucket index according to
>>                              actual objects state
> 
> After a "bi purge" a "bucket check" returns an error:
> 
> # radosgw-admin bi purge --bucket=testbucket --yes-i-really-mean-it
> # radosgw-admin bi list --bucket=testbucket
> ERROR: bi_list(): (2) No such file or directory
> # radosgw-admin bucket check --bucket=testbucket --check-objects
> 2023-02-22T16:51:11.970+0100 7fdcc6093e40  0 int RGWRados::cls_bucket_list_ordered(const DoutPrefixProvider*, RGWBucketInfo&, int, const rgw_obj_index_key&, const string&, const string&, uint32_t, bool, uint16_t, RGWRados::ent_map_t&, bool*, bool*, rgw_obj_index_key*, optional_yield, RGWBucketListNameFilter): CLSRGWIssueBucketList for :testbucket[471f26a3-ff89-4b02-911a-0c89e2e295fa.104944180.1]) failed
> 
> Adding --fix does not change anything.
> 
> I can still download the one S3 object I put in the bucket
> because I know its name, but:
> 
> # s3cmd ls s3://testbucket/
> ERROR: S3 error: 404 (NoSuchKey)
> 
> A "bucket reshard" recreates index objects:
> 
> # radosgw-admin bucket reshard --bucket=testbucket --num-shards=12
> tenant:
> bucket name: testbucket
> old bucket instance id: 471f26a3-ff89-4b02-911a-0c89e2e295fa.104944180.1
> new bucket instance id: 471f26a3-ff89-4b02-911a-0c89e2e295fa.105128491.1
> total entries: 0
> 2023-02-22T16:58:34.496+0100 7f52360dce40  1 execute INFO: reshard of bucket "testbucket" from "testbucket:471f26a3-ff89-4b02-911a-0c89e2e295fa.104944180.1" to "testbucket:471f26a3-ff89-4b02-911a-0c89e2e295fa.105128491.1" completed successfully
> 
> After that "bucket check" runs without error but cannot
> fix the situation:
> 
> # radosgw-admin bucket check --bucket=testbucket --check-objects --fix
> []
> {}
> {
>    "existing_header": {
>        "usage": {}
>    },
>    "calculated_header": {
>        "usage": {}
>    }
> }
> 
> "s3cmd ls s3://testbucket/" shows nothing.
> 
> "s3cmd rb s3://testbucket/" removes the bucket but the RADOS
> objects of the S3 objects remain in the data pool.
> 
> Regards
> -- 
> Robert Sander
> Heinlein Consulting GmbH
> Schwedter Str. 8/9b, 10119 Berlin
> 
> https://www.heinlein-support.de
> 
> Tel: 030 / 405051-43
> Fax: 030 / 405051-19
> 
> Amtsgericht Berlin-Charlottenburg - HRB 220009 B
> Geschäftsführer: Peer Heinlein - Sitz: Berlin
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
