Non-replicated buckets you can reshard, but better to confirm with other people as well.

Istvan Szabo
Senior Infrastructure Engineer
---------------------------------------------------
Agoda Services Co., Ltd.
e: istvan.szabo@xxxxxxxxx
---------------------------------------------------

-----Original Message-----
From: Boris Behrens <bb@xxxxxxxxx>
Sent: Monday, November 8, 2021 6:46 PM
To: Ceph Users <ceph-users@xxxxxxx>
Subject: Re: large bucket index in multisite environment (how to deal with large omap objects warning)?

Maybe I missed it, but can't I just reshard buckets when they are not replicated / synced / mirrored (what is the correct Ceph terminology for this)?

On Mon, 8 Nov 2021 at 12:28, mhnx <morphinwithyou@xxxxxxxxx> wrote:

> (There should not be any issues using rgw for other buckets while
> re-sharding.)
> If that's the case, then disabling access to the bucket should work,
> right? Sync should also be disabled.
>
> Yes, after the manual reshard it should clear the leftovers, but in
> my case the reshard failed and I got double entries for that bucket.
> I didn't push further; instead I split the bucket into new buckets
> and reduced the object count with a new bucket tree. I copied all of
> the objects with rclone and started the bucket removal:
>
>     radosgw-admin bucket rm --bucket=mybucket --bypass-gc --purge-objects --max-concurrent-ios=128
>
> It has been running for a very long time (started on Sep 08) and it
> is still working. There were 250M objects in that bucket, and after
> the manual reshard failure I saw a 500M object count when checking
> num_objects in "bucket stats". Now I have:
>
>     "size_kb": 10648067645,
>     "num_objects": 132270190
>
> Removal speed is 50-60 objects per second. It's not because of the
> cluster speed; the cluster is fine. I have the space, so I let it run.
> When I see a stable object count I will stop the remove process and
> start it again with the "--inconsistent-index" parameter.
> I wonder: is it safe to use that parameter with referenced objects?
> I want to learn how "--inconsistent-index" works and what it does.
>
> On Fri, 5 Nov 2021 at 17:46, Сергей Процун <prosergey07@xxxxxxxxx> wrote:
>
>> There should not be any issues using rgw for other buckets while
>> re-sharding.
>>
>> As for the object count doubling after the reshard, that is an
>> interesting situation. After a manual reshard is done, there might
>> be leftovers from the old bucket index, since new
>> .dir.new_bucket_index objects are created during the reshard. They
>> contain all data related to the objects stored in the buckets.data
>> pool. I'm wondering whether the doubled object count was caused by
>> the old bucket index. If so, it's safe to delete the old bucket
>> index.
>>
>> In a perfect world, it would be ideal to know the eventual number
>> of objects inside the bucket and set the number of shards
>> accordingly from the start.
>>
>> In the real world, when the client re-purposes the bucket, we have
>> to deal with reshards.
>>
>> On Fri, 5 Nov 2021 at 14:43, mhnx <morphinwithyou@xxxxxxxxx> wrote:
>>
>>> I also use this method and I hate it.
>>>
>>> Stopping all of the RGW clients is never an option! It shouldn't
>>> be. Sharding is hell. I had 250M objects in a bucket, the reshard
>>> failed after 2 days, and the object count somehow doubled! 2 days
>>> of downtime is not an option.
>>>
>>> I wonder: if I stop reads and writes on a bucket while resharding
>>> it, is there any problem using the RGWs for all other buckets?
>>>
>>> Nowadays I advise splitting buckets as much as you can. That means
>>> changing your app's directory tree, but this design requires it.
>>> You need to plan the object count for at least 5 years and create
>>> the buckets accordingly.
>>> Usually I use 101 shards, which means 10,100,000 objects.
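mhnx's rule of thumb above (101 shards for ~10,100,000 objects, i.e. roughly 100,000 objects per index shard) can be sketched as a small shell helper. Note this is an editorial illustration: `shards_for` is a hypothetical name, and the per-shard target is an assumption you may want to tune for your cluster.

```shell
# Rough shard-count estimate, assuming ~100,000 objects per index
# shard (the ratio implied by 101 shards for ~10.1M objects).
objects_per_shard=100000

shards_for() {
    # ceiling division: expected objects / objects per shard
    echo $(( ($1 + objects_per_shard - 1) / objects_per_shard ))
}

shards_for 10100000   # -> 101
shards_for 250000000  # -> 2500
```

For a versioned bucket, multiply the result by 2 or 3, as suggested above, since versions are hard to predict.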
>>> Also, if I need to use versioning, I use 2x101 or 3x101, because
>>> versions are hard to predict. You need to predict how many
>>> versions you need and set a lifecycle even before using the
>>> bucket!
>>> The max shard count that I use is 1999. I'm not happy about it,
>>> but sometimes you gotta do what you need to do.
>>> Fighting with customers is not an option; you can only advise
>>> changing their app's folder tree, but I've never seen anyone
>>> accept the deal without arguing.
>>>
>>> My offers usually look like this:
>>> 1- Core files bucket: no changes, or very limited changes.
>>>    "Calculate the object count and multiply by 2."
>>> 2- Hot data bucket: there will be daily changes and versioning.
>>>    "Calculate the object count and multiply by 3."
>>> 3- Cold data bucket[s]: there will be no daily changes. You should
>>>    open new buckets every year or month. This keeps things clean
>>>    and steady. No need for versioning, and multisite will not
>>>    suffer since changes are rare.
>>> 4- Temp files bucket[s]: this is important. If you're crawling
>>>    millions upon millions of objects every day and deleting them
>>>    at the end of the week or month, then you should definitely use
>>>    a temp bucket. No versioning, no multisite, no index if
>>>    possible.
>>>
>>> On Fri, 5 Nov 2021 at 12:30, Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx> wrote:
>>>
>>> > You mean prepare or preshard?
>>> > Prepare:
>>> > I collect as much information from the users as possible before
>>> > onboarding, so I can prepare for their future use case and set
>>> > things up.
>>> >
>>> > Preshard:
>>> > After creating the bucket:
>>> >
>>> >     radosgw-admin bucket reshard --bucket=ex-bucket --num-shards=101
>>> >
>>> > Also, when you shard buckets, you need to use prime numbers.
>>> >
>>> > Istvan Szabo
>>> > Senior Infrastructure Engineer
>>> > ---------------------------------------------------
>>> > Agoda Services Co., Ltd.
>>> > e: istvan.szabo@xxxxxxxxx
>>> > ---------------------------------------------------
>>> >
>>> > From: Boris Behrens <bb@xxxxxxxxx>
>>> > Sent: Friday, November 5, 2021 4:22 PM
>>> > To: Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>; ceph-users@xxxxxxx
>>> > Subject: Re: large bucket index in multisite environment (how to deal with large omap objects warning)?
>>> >
>>> > Cheers Istvan,
>>> >
>>> > how do you do this?
>>> >
>>> > On Thu, 4 Nov 2021 at 19:45, Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx> wrote:
>>> >
>>> > This one you need to prepare: you need to preshard the bucket
>>> > which you know will hold more than millions of objects.
>>> >
>>> > I have a bucket where we store 1.2 billion objects with 24xxx shards.
>>> > No omap issue.
>>> >
>>> > Istvan Szabo
>>> > Senior Infrastructure Engineer
>>> > ---------------------------------------------------
>>> > Agoda Services Co., Ltd.
>>> > e: istvan.szabo@xxxxxxxxx
>>> > ---------------------------------------------------
>>>

--
The self-help group "UTF-8-Probleme" will, as an exception, meet in the large hall this time.
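Istvan's presharding step above, combined with the prime-number advice, might look like the sketch below. This is an editorial illustration: `next_prime` is a hypothetical helper, `ex-bucket` is an example name from the thread, and the `radosgw-admin` call is commented out since it requires a live cluster.

```shell
# Bump a desired shard count up to the nearest prime, per the advice
# in the thread, then preshard the (still empty) bucket once.
next_prime() {
    local n=$1 i
    (( n < 2 )) && n=2
    while :; do
        # trial division up to sqrt(n); on any factor, try n+1
        for (( i = 2; i * i <= n; i++ )); do
            (( n % i == 0 )) && { (( n++ )); continue 2; }
        done
        echo "$n"
        return
    done
}

shards=$(next_prime 100)   # -> 101
# radosgw-admin bucket reshard --bucket=ex-bucket --num-shards="$shards"
```

Presharding before the bucket takes any writes avoids the long, failure-prone reshard of a full bucket described earlier in the thread.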
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx