Re: Radosgw bucket check fix doesn't do anything

Hi Reid, 

I see. It does seem odd that the --fix command output shows no difference between existing_header and calculated_header even though it cleaned up some index entries (the "removing manifest part from index" lines). 
Have you tried running the stats command again afterwards to see whether any figures were updated? Based on this documentation [1], the stats should have been updated with the calculated values. But that's a shot in the dark; I don't have much experience with this tool. 
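
For example, re-running the same stats command you already used (just reusing the bucket name from your output) and comparing num_objects before and after the --fix run should tell you whether anything actually moved: 

~/ radosgw-admin bucket stats --bucket mimir-prod | jq .usage 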

Now, since rgw.multimeta.num_objects (which relates to multipart upload parts, iirc) is nowhere near 17k, you could try listing all objects in the bucket index and comparing that list to the rados listing of the bucket's objects. 
fgrep -f is handy for this kind of comparison; a rough sketch follows. 
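
Roughly something along these lines (a sketch only: the data pool name below is the default one and may differ on your cluster, the jq path for the bi list entries can vary between versions, and multipart/shadow rados objects carry extra name prefixes, so expect some noise in the diff): 

~/ # dump the bucket index entry names
~/ radosgw-admin bi list --bucket mimir-prod | jq -r '.[].entry.name' | sort -u > index-keys.txt 
~/ # dump the rados objects belonging to this bucket, stripping the bucket marker prefix
~/ marker=$(radosgw-admin bucket stats --bucket mimir-prod | jq -r .marker) 
~/ rados -p default.rgw.buckets.data ls | grep "^${marker}_" | sed "s/^${marker}_//" | sort -u > rados-keys.txt 
~/ # index entries with no exact match among the rados objects
~/ fgrep -vxf rados-keys.txt index-keys.txt > index-entries-without-rados-objects.txt 

Anything left in the last file would be index entries that no longer have a matching rados object. 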

Maybe this will provide you with useful information about the situation. 

Regards, 
Frédéric. 

[1] https://www.ibm.com/docs/en/storage-ceph/7?topic=management-managing-bucket-index-entries 

----- On 18 Sep 24, at 20:39, Reid Guyett <reid.guyett@xxxxxxxxx> wrote: 

> Hi Frederic,
> Thanks for those notes.

> When I scan the list of multiparts, I do not see the items from the invalid
> multipart list. Example:
> The first entry here

>> radosgw-admin bucket check --bucket mimir-prod | head
>> {
>> "invalid_multipart_entries": [
>> "_multipart_network/01H9CFRA45MJWBHQRCHRR4JHV4/index.sJRTCoqiZvlge2cjz6gLU7DwuLI468zo.2",
>> ...

> does not appear in the abort-multipart-upload.txt I get from the
> list-multipart-uploads

>> $ grep -c 01H9CFRA45MJWBHQRCHRR4JHV4 abort-multipart-upload.txt
>> 0

> If I try to abort the invalid multipart, it says it does not exist.

>> $ aws --profile mimir-prod --endpoint-url https://my.objectstorage.domain s3api abort-multipart-upload --bucket
>> mimir-prod --key "network/01H9CFRA45MJWBHQRCHRR4JHV4/index" --upload-id
>> "sJRTCoqiZvlge2cjz6gLU7DwuLI468zo.2"

>> An error occurred (NoSuchUpload) when calling the AbortMultipartUpload
>> operation: Unknown
> I seem to have many buckets with this type of state. I'm hoping to be able to
> fix them.

> Thanks!

> On Wed, Sep 18, 2024 at 4:21 AM Frédéric Nass <frederic.nass@xxxxxxxxxxxxxxxx> wrote:

>> Hi Reid,

>> The bucket check --fix will not clean up aborted multipart uploads. An S3 client
>> will.

>> You need to either set a Lifecycle policy on buckets to have these cleaned up
>> automatically after some time

>> ~/ cat /home/lifecycle.xml
>> <LifecycleConfiguration>
>>   <Rule>
>>     <AbortIncompleteMultipartUpload>
>>       <DaysAfterInitiation>3</DaysAfterInitiation>
>>     </AbortIncompleteMultipartUpload>
>>     <Prefix></Prefix>
>>     <Status>Enabled</Status>
>>   </Rule>
>> </LifecycleConfiguration>

>> ~/ s3cmd setlifecycle lifecycle.xml s3://bucket-test

>> Or get rid of them manually by using an s3 client

>> ~/ aws s3api list-multipart-uploads --endpoint-url=https://my.objectstorage.domain --bucket
>> mimir-prod | jq -r '.Uploads[] | "--key \"\(.Key)\" --upload-id \(.UploadId)"'
>> > abort-multipart-upload.txt

>> ~/ max=$(cat abort-multipart-upload.txt | wc -l); i=1; while read -r line; do
>> echo -n "$i/$max"; ((i=i+1)); eval "aws s3api abort-multipart-upload
>> --endpoint-url=https://my.objectstorage.domain --bucket mimir-prod $line"; done <
>> abort-multipart-upload.txt

>> Regards,
>> Frédéric.

>> ----- On 17 Sep 24, at 14:27, Reid Guyett <reid.guyett@xxxxxxxxx> wrote:

>> > Hello,

>> > I recently moved a bucket from 1 cluster to another cluster using rclone. I
>> > noticed that the source bucket had around 35k objects and the destination
>> > bucket only had around 18k objects after the sync was completed.

>> > Source bucket stats showed:

>> >> radosgw-admin bucket stats --bucket mimir-prod | jq .usage
>> >> {
>> >>   "rgw.main": {
>> >>     "size": 4321515978174,
>> >>     "size_actual": 4321552605184,
>> >>     "size_utilized": 4321515978174,
>> >>     "size_kb": 4220230448,
>> >>     "size_kb_actual": 4220266216,
>> >>     "size_kb_utilized": 4220230448,
>> >>     "num_objects": 35470
>> >>   },
>> >>   "rgw.multimeta": {
>> >>     "size": 0,
>> >>     "size_actual": 0,
>> >>     "size_utilized": 66609,
>> >>     "size_kb": 0,
>> >>     "size_kb_actual": 0,
>> >>     "size_kb_utilized": 66,
>> >>     "num_objects": 2467
>> >>   }
>> >> }

>> > Destination bucket stats showed:

>> >> radosgw-admin bucket stats --bucket mimir-prod | jq .usage
>> >> {
>> >>   "rgw.main": {
>> >>     "size": 4068176326491,
>> >>     "size_actual": 4068212576256,
>> >>     "size_utilized": 4068176326491,
>> >>     "size_kb": 3972828444,
>> >>     "size_kb_actual": 3972863844,
>> >>     "size_kb_utilized": 3972828444,
>> >>     "num_objects": 18525
>> >>   },
>> >>   "rgw.multimeta": {
>> >>     "size": 0,
>> >>     "size_actual": 0,
>> >>     "size_utilized": 108,
>> >>     "size_kb": 0,
>> >>     "size_kb_actual": 0,
>> >>     "size_kb_utilized": 1,
>> >>     "num_objects": 4
>> >>   }
>> >> }

>> > When I checked the source bucket using the aws CLI tool, it showed around 18k
>> > objects. The bucket was actively being used, so the 18k figure is slightly
>> > different.

>> >> aws --profile mimir-prod --endpoint-url https://my.objectstorage.domain s3api list-objects --bucket mimir-prod > mimir_objs
>> >> cat mimir_objs | grep -c "Key"
>> >> 18090

>> > I did a check on the source bucket and it showed a lot of invalid
>> > multipart objects.

>> >> radosgw-admin bucket check --bucket mimir-prod | head
>> >> {
>> >>   "invalid_multipart_entries": [
>> >>     "_multipart_network/01H9CFRA45MJWBHQRCHRR4JHV4/index.sJRTCoqiZvlge2cjz6gLU7DwuLI468zo.2",
>> >>     "_multipart_network/01HMCCRMTC5F4BFCZ56BKHTMWQ/index.6ypGbeMr6Jg3y7xAL8yrLL-v4sbFzjSA.3",
>> >>     "_multipart_network/01HMFKR56RRZNX9VT9B4F49MMD/chunks/000001.JIC7fFA_q96nal1yGXsVSPCY8EMe5AU8.2",
>> >>     "_multipart_network/01HMFKSND2E5BWF6QVTX8SDRRQ/index.57aSNeXn3j70H4EHfbNCD2RpoOp-P1Bv.2",
>> >>     "_multipart_network/01HMFKTDNA3FVSWW7N8KYY2C7N/chunks/000001.2~kRjRbLWWDf1e40P40LUzdU3f_x2P46Q.2",
>> >>     "_multipart_network/01HMFTMA8J1DEXYHKMVCXCC0GM/chunks/000001.GVajdCja0gHOLlgyFanF72A4B6ZqUpu5.2",
>> >>     "_multipart_network/01HMFTMA8J1DEXYHKMVCXCC0GM/chunks/000001.GYaouEePvEdbQosCb5jLFCAHrSm9VoDh.2",
>> >>     "_multipart_network/01HMFTMA8J1DEXYHKMVCXCC0GM/chunks/000001.r4HkP-JK-rBAWDoXBXKJJYEAjk39AswW.1",
>> >>     ...

>> > So I tried to run `radosgw-admin bucket check --check-objects --bucket
>> > mimir-prod --fix` and it showed that it was cleaning things with thousands
>> > of lines like

>> >> 2024-09-17T12:19:42.212+0000 7fea25b6f9c0 0 check_disk_state(): removing
>> >> manifest part from index:
>> >> mimir-prod:_multipart_tenant_prod/01J7Q778YXJXE23SRQZM9ZA4NH/chunks/000001.2~m6EI5fHFWxI-RmWB6TeFSupu7vVrCgh.2
>> >> 2024-09-17T12:19:42.212+0000 7fea25b6f9c0 0 check_disk_state(): removing
>> >> manifest part from index:
>> >> mimir-prod:_multipart_tenant_prod/01J7Q778YXJXE23SRQZM9ZA4NH/chunks/000001.2~m6EI5fHFWxI-RmWB6TeFSupu7vVrCgh.3
>> >> 2024-09-17T12:19:42.212+0000 7fea25b6f9c0 0 check_disk_state(): removing
>> >> manifest part from index:
>> >> mimir-prod:_multipart_tenant_prod/01J7Q778YXJXE23SRQZM9ZA4NH/chunks/000001.2~m6EI5fHFWxI-RmWB6TeFSupu7vVrCgh.4
>> >> 2024-09-17T12:19:42.212+0000 7fea25b6f9c0 0 check_disk_state(): removing
>> >> manifest part from index:
>> >> mimir-prod:_multipart_tenant_prod/01J7Q778YXJXE23SRQZM9ZA4NH/chunks/000001.2~m6EI5fHFWxI-RmWB6TeFSupu7vVrCgh.5
>> >> 2024-09-17T12:19:42.213+0000 7fea25b6f9c0 0 check_disk_state(): removing
>> >> manifest part from index:
>> >> mimir-prod:_multipart_tenant_prod/01J7Q778YXJXE23SRQZM9ZA4NH/chunks/000001.2~m6EI5fHFWxI-RmWB6TeFSupu7vVrCgh.6

>> > but the end result shows nothing has changed.

>> >> "check_result": {
>> >> "existing_header": {
>> >> "usage": {
>> >> "rgw.main": {
>> >> "size": 4281119287051,
>> >> "size_actual": 4281159110656,
>> >> "size_utilized": 4281119287051,
>> >> "size_kb": 4180780554,
>> >> "size_kb_actual": 4180819444,
>> >> "size_kb_utilized": 4180780554,
>> >> "num_objects": 36429
>> >> },
>> >> "rgw.multimeta": {
>> >> "size": 0,
>> >> "size_actual": 0,
>> >> "size_utilized": 66636,
>> >> "size_kb": 0,
>> >> "size_kb_actual": 0,
>> >> "size_kb_utilized": 66,
>> >> "num_objects": 2468
>> >> }
>> >> }
>> >> },
>> >> "calculated_header": {
>> >> "usage": {
>> >> "rgw.main": {
>> >> "size": 4281119287051,
>> >> "size_actual": 4281159110656,
>> >> "size_utilized": 4281119287051,
>> >> "size_kb": 4180780554,
>> >> "size_kb_actual": 4180819444,
>> >> "size_kb_utilized": 4180780554,
>> >> "num_objects": 36429
>> >> },
>> >> "rgw.multimeta": {
>> >> "size": 0,
>> >> "size_actual": 0,
>> >> "size_utilized": 66636,
>> >> "size_kb": 0,
>> >> "size_kb_actual": 0,
>> >> "size_kb_utilized": 66,
>> >> "num_objects": 2468
>> >> }
>> >> }
>> >> }
>> >> }

>> > Does this command do anything? Is it the wrong command for this issue? How
>> > does one go about fixing buckets in this state?

>> > Thanks!

>> > Reid
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



