Re: radosgw octopus - how to cleanup orphan multipart uploads

I am currently going through all our buckets, which takes some time:
# for BUCKET in `radosgw-admin bucket stats | jq -r '.[] | .bucket'`; do
    radosgw-admin bi list --bucket "${BUCKET}" |
      jq -r '.[] | select(.idx? | match("_multipart.*")) | .idx + ", " + .entry.meta.mtime' \
      > "${BUCKET}.multiparts"
  done

And one of the files looks like this:
# cat private-images-70d7e202-97bd-451a-92a0-67638a378a7e.multiparts
_multipart_8cfd0bdb-05f9-40cd-a50d-83295b416ea9.lz4.CwlAWozuCYXDKYvhkW5RiZUxuaNfu48C.365, 2022-08-30T14:20:36.880045Z
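
To narrow such a file down to entries older than a given date, a small filter along these lines might help. This is only a sketch: the function name `older_than` and the cutoff value are made up for illustration, and it assumes the "idx, mtime" line format written by the loop above, with mtime as a UTC ISO-8601 timestamp so that plain string comparison matches time order.

```shell
# Read "<idx>, <mtime>" lines on stdin and print those whose mtime
# sorts strictly before the cutoff passed as $1.
# Assumption: mtime is the last ", "-separated field and is a UTC
# ISO-8601 timestamp, so string order equals chronological order.
older_than() {
    # Appending "" forces awk to compare as strings, never as numbers.
    awk -F', ' -v cutoff="$1" '($NF "") < (cutoff "") { print }'
}

# Hypothetical usage: entries last modified before 1 Sep 2022
# older_than 2022-09-01T00:00:00Z < BUCKET.multiparts
```

It only reads the text files and changes nothing in the cluster.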

# radosgw-admin bi list --bucket private-images-70d7e202-97bd-451a-92a0-67638a378a7e | jq -r '.[] | .idx'
8cfd0bdb-05f9-40cd-a50d-83295b416ea9.lz4
_multipart_8cfd0bdb-05f9-40cd-a50d-83295b416ea9.lz4.CwlAWozuCYXDKYvhkW5RiZUxuaNfu48C.365
a73ff5a1-0712-4d5e-b41c-36d73ec7897d.lz4

Three entries in the bucket index, one of them an orphan multipart. Great for
testing. But now comes this:
# radosgw-admin bucket radoslist --bucket private-images-70d7e202-97bd-451a-92a0-67638a378a7e | grep 8cfd0bdb-05f9-40cd-a50d-83295b416ea9.lz4.CwlAWozuCYXDKYvhkW5RiZUxuaNfu48C.365
ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2339856956.63__multipart_8cfd0bdb-05f9-40cd-a50d-83295b416ea9.lz4.CwlAWozuCYXDKYvhkW5RiZUxuaNfu48C.365
ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2339856956.63__shadow_8cfd0bdb-05f9-40cd-a50d-83295b416ea9.lz4.CwlAWozuCYXDKYvhkW5RiZUxuaNfu48C.365_1
ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2339856956.63__multipart_8cfd0bdb-05f9-40cd-a50d-83295b416ea9.lz4.CwlAWozuCYXDKYvhkW5RiZUxuaNfu48C.365

How can one RADOS object appear twice in the listing? I thought it might be a
display issue, so I checked with:
# for i in `radosgw-admin bucket radoslist --bucket private-images-70d7e202-97bd-451a-92a0-67638a378a7e | grep 8cfd0bdb-05f9-40cd-a50d-83295b416ea9.lz4.CwlAWozuCYXDKYvhkW5RiZUxuaNfu48C.365`; do echo "--${i}--"; done
--ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2339856956.63__multipart_8cfd0bdb-05f9-40cd-a50d-83295b416ea9.lz4.CwlAWozuCYXDKYvhkW5RiZUxuaNfu48C.365--
--ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2339856956.63__shadow_8cfd0bdb-05f9-40cd-a50d-83295b416ea9.lz4.CwlAWozuCYXDKYvhkW5RiZUxuaNfu48C.365_1--
--ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2339856956.63__multipart_8cfd0bdb-05f9-40cd-a50d-83295b416ea9.lz4.CwlAWozuCYXDKYvhkW5RiZUxuaNfu48C.365--
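
To look for repeated names across the whole listing rather than for a single part, the output could be piped through sort and uniq. A minimal sketch follows. The helper name `dup_lines` is hypothetical, and it only inspects text on stdin, so it says nothing about why radoslist prints a name twice.

```shell
# Print every line that occurs more than once on stdin (each one once).
dup_lines() {
    sort | uniq -d
}

# Hypothetical usage against a bucket named BUCKET:
# radosgw-admin bucket radoslist --bucket BUCKET | dup_lines
```

An empty result would mean every name in the listing is unique.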



On Fri, 2 Dec 2022 at 12:17, Boris Behrens <bb@xxxxxxxxx> wrote:

> Hi,
> we are currently encountering a lot of broken / orphan multipart uploads.
>
> When I try to fetch the multipart uploads via s3cmd, it just never
> finishes.
> Debug output looks like this and it basically never changes.
>
> DEBUG: signature-v4 headers: {'x-amz-date': '20221202T105838Z',
> 'Authorization': 'XXX', 'x-amz-content-sha256': 'XXX'}
> DEBUG: Processing request, please wait...
> DEBUG: get_hostname(BUCKET): BUCKET.TLD
> DEBUG: ConnMan.get(): re-using connection: https://BUCKET.TLD#48
> DEBUG: format_uri():
> /?KeyMarker=FILE&UploadIdMarker=2~nWhBa1z7eJG_7oUw-GrWZTT0CqeNdUJ&uploads
> DEBUG: Sending request method_string='GET',
> uri='/?KeyMarker=FILE&UploadIdMarker=2~nWhBa1z7eJG_7oUw-GrWZTT0CqeNdUJ&uploads',
> headers={'x-amz-date': '20221202T105838Z', 'Authorization': 'XXX',
> 'x-amz-content-sha256': 'XXX'}, body=(0 bytes)
> DEBUG: ConnMan.put(): connection put back to pool (https://BUCKET.TLD#49)
> DEBUG: Response:
> {'data': b'<?xml version="1.0" encoding="UTF-8"?><ListMultipartUploadsResult xm'
>          b'lns="http://s3.amazonaws.com/doc/2006-03-01/"><Bucket>BUCKET'
>          b'BUCKET</Bucket><NextKeyMarker>743ff'
>          b'64d-3ad6-4e3c-8816-4d8a81264657.lz4</NextKeyMarker><NextUploadIdMark'
>          b'er>2~nWhBa1z7eJG_7oUw-GrWZTT0CqeNdUJ</NextUploadIdMarker><MaxUploads'
>          b'>1000</MaxUploads><IsTruncated>true</IsTruncated><Upload><Key>FILE'
>          b'FILE</Key><UploadId>2~i5e5YWnSM8tGpF_a'
>          b'fEkr19EBzn4x09b</UploadId><Initiator><ID>USERID</ID><DisplayName>'
>          b'USERID</DisplayName></Initiator><Owner><ID>USERID</ID><DisplayN'
>          b'ame>USERID</DisplayName></Owner><StorageClass>STANDARD</StorageCla'
>          b'ss><Initiated>2022-11-27T08:48:36.037Z</Initiated></Upload><Upload><'
>          b'Key>FILE</Key><UploadId>2~nWhBa1'
>          b'z7eJG_7oUw-GrWZTT0CqeNdUJ</UploadId><Initiator><ID>USERID</ID><Di'
>          b'splayName>USERID</DisplayName></Initiator><Owner><ID>USERID</ID'
>          b'><DisplayName>USERID</DisplayName></Owner><StorageClass>STANDARD</'
>          b'StorageClass><Initiated>2022-11-28T04:38:44.781Z</Initiated></Upload'
>          b'></ListMultipartUploadsResult>'
>
>
> Listing all files in the bucket shows all the files I would expect, basically
> all finished multipart uploads. In this case there are 30 files in the
> bucket.
> But when I check the bucket stats, they report 16518 objects, and listing the
> bucket index shows a huge number of very old _multipart_ entries.
>
> I cross-referenced with another bucket, and there num_objects only counts the
> actual files (like 7 objects with a total size of 32GB).
>
> How do I clean them up?
> Last time I removed the RADOS objects, then removed the omap keys from
> the index, and after that I was able to run bucket check --check-objects
> --fix on that bucket.
> I also tried to remove the object with radosgw-admin object rm --bucket
> BUCKET --object=_multipart.... but this does not work either.
>
> Is there an easier way to do this?
> I would love not to have to fix 100k objects by hand.
>
> --
> The "UTF-8 problems" self-help group will, as an exception, meet in the
> large hall this time.
>


-- 
The "UTF-8 problems" self-help group will, as an exception, meet in the
large hall this time.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



