Re: large bucket index in multisite environment (how to deal with large omap objects warning)?

Maybe I missed it, but can't I just reshard buckets when they are not
replicated / synced / mirrored (what is the correct Ceph terminology for
this)?
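
Something like this is what I have in mind (untested sketch; mybucket is a
placeholder and 101 is just an example shard count):

  # confirm the bucket is not part of the sync
  radosgw-admin bucket sync status --bucket=mybucket
  # belt and suspenders: turn per-bucket sync off explicitly
  radosgw-admin bucket sync disable --bucket=mybucket
  radosgw-admin bucket reshard --bucket=mybucket --num-shards=101
  radosgw-admin bucket sync enable --bucket=mybucket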

On Mon, Nov 8, 2021 at 12:28, mhnx <morphinwithyou@xxxxxxxxx> wrote:

> (There should not be any issues using RGW for other buckets while
> re-sharding.)
> If there are issues, then disabling access to the bucket should work,
> right? Sync should also be disabled.
>
> Yes, after the manual reshard it should clear the leftovers, but in my
> situation the resharding failed and I got double entries for that bucket.
> I didn't push further; instead I split the bucket into new buckets and
> reduced the object count with a new bucket tree. I copied all of the
> objects with rclone and started removing the old bucket with "radosgw-admin
> bucket rm --bucket=mybucket --bypass-gc --purge-objects
> --max-concurrent-ios=128". It has been running for a very long time (it
> started on Sep 8) and it is still working. There were 250M objects in that
> bucket, and after the manual reshard failed I saw a 500M object count when
> checking num_objects in the bucket stats. Now I have:
> "size_kb": 10648067645,
> "num_objects": 132270190
>
> The removal speed is 50-60 objects per second, and that is not because of
> the cluster; the cluster is fine.
> I have the space, so I am letting it run. When I see a stable object count
> I will stop the removal process and start it again with the
> --inconsistent-index parameter.
> I wonder whether it is safe to use that parameter with referenced objects.
> I would like to learn how --inconsistent-index works and what it does.
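>
> For reference, the restart would be the same command plus the extra flag
> (sketch; same placeholder bucket as above):
>
>   radosgw-admin bucket rm --bucket=mybucket --bypass-gc --purge-objects \
>       --inconsistent-index --max-concurrent-ios=128
>
> As far as I understand, --inconsistent-index skips keeping the bucket
> index consistent while objects are deleted, which is why it should only be
> used on a bucket that nothing else is reading or writing.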
>
> On Fri, Nov 5, 2021 at 17:46, Сергей Процун <prosergey07@xxxxxxxxx> wrote:
>
>> There should not be any issues using RGW for other buckets while
>> re-sharding.
>>
>> The doubled number of objects after the reshard is an interesting
>> situation, though. After a manual reshard is done, there may be leftovers
>> from the old bucket index, since during the reshard new
>> .dir.<new_bucket_index> objects are created. They contain all the data
>> related to the objects stored in the buckets.data pool. I am wondering
>> whether the doubled object count was caused by the old bucket index; if
>> so, it is safe to delete the old bucket index.
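>>
>> On recent releases there are helpers for exactly this cleanup (sketch;
>> check your version first, and note that the documentation warns against
>> running stale-instances rm in a multisite setup):
>>
>>   radosgw-admin reshard stale-instances list
>>   radosgw-admin reshard stale-instances rm
>>
>> The leftover index objects can also be inspected directly in the index
>> pool, e.g. (pool name is an example):
>>
>>   rados -p default.rgw.buckets.index ls | grep <old_bucket_instance_id>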
>>
>> In a perfect world, you would know the eventual number of objects in the
>> bucket ahead of time and set the number of shards accordingly when the
>> bucket is created.
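>>
>> A quick way to see how close existing buckets are to the per-shard limit
>> (sketch; the output fields vary a bit by release):
>>
>>   radosgw-admin bucket limit check
>>
>> It reports num_objects, num_shards, objects_per_shard and fill_status for
>> each bucket.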
>>
>> In the real world, when a client repurposes a bucket, we have to deal
>> with reshards.
>>
>> On Fri, Nov 5, 2021 at 14:43, mhnx <morphinwithyou@xxxxxxxxx> wrote:
>>
>>> I also use this method and I hate it.
>>>
>>> Stopping all of the RGW clients is never an option! It shouldn't be.
>>> Sharding is hell. I had 250M objects in a bucket, the reshard failed
>>> after 2 days, and the object count somehow doubled! 2 days of downtime
>>> is not an option.
>>>
>>> I wonder: if I stop reads and writes on a bucket while it is being
>>> resharded, is there any problem with using the RGWs for all the other
>>> buckets?
>>>
>>> Nowadays I advise splitting buckets as much as you can! That means
>>> changing your app's directory tree, but this design requires it.
>>> You need to plan the object count for at least 5 years ahead and create
>>> the buckets accordingly.
>>> Usually I use 101 shards, which means 10,100,000 objects.
>>> If I need versioning I use 2x101 or 3x101 shards, because versions are
>>> hard to predict. You need to predict how many versions you will need and
>>> set a lifecycle policy even before using the bucket!
>>> The maximum shard count I use is 1999. I'm not happy about it, but
>>> sometimes you gotta do what you need to do.
>>> Fighting with customers is not an option; you can only advise changing
>>> their app's folder tree, and I've never seen anyone accept the deal
>>> without arguing.
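>>>
>>> The arithmetic behind the 101 shards, for anyone wondering: the usual
>>> sizing rule is the default rgw_max_objs_per_shard = 100000, so 101
>>> shards * 100000 = 10,100,000 objects before the per-shard warnings
>>> start. The threshold is tunable in ceph.conf (sketch, showing the
>>> default):
>>>
>>>   [global]
>>>   rgw max objs per shard = 100000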
>>>
>>> My offers usually look like this:
>>> 1- Core files bucket: no changes, or very limited changes. Calculate the
>>> object count and multiply it by 2.
>>> 2- Hot data bucket: there will be daily changes and versioning.
>>> Calculate the object count and multiply it by 3.
>>> 3- Cold data bucket[s]: there will be no daily changes. You should open
>>> new buckets every year or month. This keeps things clean and steady.
>>> There is no need for versioning, and multisite will not suffer, because
>>> changes are rare.
>>> 4- Temp files bucket[s]: this one is very important. If you are crawling
>>> millions upon millions of objects every day and deleting them at the end
>>> of the week or month, then you should definitely use a temp bucket. No
>>> versioning, no multisite, no index if possible.
>>>
>>>
>>>
>>> On Fri, Nov 5, 2021 at 12:30, Szabo, Istvan (Agoda)
>>> <Istvan.Szabo@xxxxxxxxx> wrote:
>>>
>>> > You mean prepare or preshard?
>>> > Prepare:
>>> > I collect as much information as I can from users before onboarding,
>>> > so I can prepare for their future use case and set things up
>>> > accordingly.
>>> >
>>> > Preshard:
>>> > After creating the bucket:
>>> > radosgw-admin bucket reshard --bucket=ex-bucket --num-shards=101
>>> >
>>> > Also, when you shard buckets, you need to use a prime number of
>>> > shards.
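>>> >
>>> > If you want brand-new buckets to start out sharded instead of
>>> > resharding them right after creation, there is also a config option
>>> > for that (sketch; 101 is just an example value):
>>> >
>>> >   rgw_override_bucket_index_max_shards = 101
>>> >
>>> > With that set, every newly created bucket gets that many index shards
>>> > from the start.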
>>> >
>>> > Istvan Szabo
>>> > Senior Infrastructure Engineer
>>> > ---------------------------------------------------
>>> > Agoda Services Co., Ltd.
>>> > e: istvan.szabo@xxxxxxxxx<mailto:istvan.szabo@xxxxxxxxx>
>>> > ---------------------------------------------------
>>> >
>>> > From: Boris Behrens <bb@xxxxxxxxx>
>>> > Sent: Friday, November 5, 2021 4:22 PM
>>> > To: Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>; ceph-users@xxxxxxx
>>> > Subject: Re: large bucket index in multisite environment
>>> > (how to deal with large omap objects warning)?
>>> >
>>> > Cheers Istvan,
>>> >
>>> > how do you do this?
>>> >
>>> > On Thu, Nov 4, 2021 at 19:45, Szabo, Istvan (Agoda) <
>>> > Istvan.Szabo@xxxxxxxxx<mailto:Istvan.Szabo@xxxxxxxxx>> wrote:
>>> > This one you need to prepare for: you need to preshard any bucket
>>> > that you know will hold millions of objects.
>>> >
>>> > I have a bucket where we store 1.2 billion objects with 24xxx shards.
>>> > No omap issues.
>>> > Istvan Szabo
>>> > Senior Infrastructure Engineer
>>> > ---------------------------------------------------
>>> > Agoda Services Co., Ltd.
>>> > e: istvan.szabo@xxxxxxxxx<mailto:istvan.szabo@xxxxxxxxx>
>>> > ---------------------------------------------------
>>> >
>>> >
>>> >
>>> > --
>>> > The "UTF-8-Probleme" self-help group will meet in the big hall this
>>> > time, as an exception.
>>

-- 
The "UTF-8-Probleme" self-help group will meet in the big hall this time,
as an exception.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



