Re: large bucket index in multisite environment (how to deal with large omap objects warning)?


 



Theoretically you should be able to reshard buckets which are not in sync. Resharding produces new .dir.NEW_BUCKET_INDEX objects inside your bucket index pool and writes the omap key/values into the new shards. The objects themselves are left intact, because the marker id does not change: every object within rados has the marker id as the prefix of the object's name. After a reshard the old bucket index objects are supposed to be deleted, but that does not always happen, and in certain cases you need to remove them manually.

So, check the old bucket id by running:

    radosgw-admin metadata get "bucket:BUCKET_NAME"

Then get the number of shards your bucket has; you might need them in the future:

    radosgw-admin metadata get "bucket.instance:BUCKET_NAME:BUCKET_ID"

Then you can run the reshard. After the reshard is finished, verify whether old bucket index objects still exist:

    rados -p bucket.index ls | grep OLD_BUCKET_INDEX

If old shards are found, you might need to remove them. After sync, also check for old shards on the secondary sites.

Sent from a Galaxy device
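Condensed into commands, the procedure above might look like the following sketch. MYBUCKET, OLD_BUCKET_ID and the pool name default.rgw.buckets.index are placeholders, and the shard count is only an example:

    # 1. Note the current bucket id (the "id" field in the output) before resharding:
    radosgw-admin metadata get "bucket:MYBUCKET"

    # 2. Record the current shard count ("num_shards" in the instance metadata):
    radosgw-admin metadata get "bucket.instance:MYBUCKET:OLD_BUCKET_ID"

    # 3. Reshard; this creates new .dir.<new_id>.<shard> objects in the index pool:
    radosgw-admin bucket reshard --bucket=MYBUCKET --num-shards=101

    # 4. Look for leftover shards of the old index:
    rados -p default.rgw.buckets.index ls | grep ".dir.OLD_BUCKET_ID"

    # 5. If leftovers are found, remove them one by one (shard suffixes .0, .1, ...):
    rados -p default.rgw.buckets.index rm ".dir.OLD_BUCKET_ID.0"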
-------- Original message --------
From: "Szabo, Istvan (Agoda)" <Istvan.Szabo@xxxxxxxxx>
Date: 08.11.21 13:48 (GMT+02:00)
To: Boris Behrens <bb@xxxxxxxxx>, Ceph Users <ceph-users@xxxxxxx>
Subject: Re: large bucket index in multisite environment (how to deal with large omap objects warning)?

Buckets that are not replicated you can reshard, but better to confirm with other people as well.

Istvan Szabo
Senior Infrastructure Engineer
Agoda Services Co., Ltd.
e: istvan.szabo@xxxxxxxxx

-----Original Message-----
From: Boris Behrens <bb@xxxxxxxxx>
Sent: Monday, November 8, 2021 6:46 PM
To: Ceph Users <ceph-users@xxxxxxx>
Subject: [ceph-users] Re: large bucket index in multisite environment (how to deal with large omap objects warning)?

Maybe I missed it, but can't I just reshard buckets when they are not replicated / synced / mirrored (what is the correct ceph terminology for this)?

On Mon, 8 Nov 2021 at 12:28, mhnx <morphinwithyou@xxxxxxxxx> wrote:

> (There should not be any issues using rgw for other buckets while
> re-sharding.)
> If that's the case, then disabling access to the bucket will work,
> right? Sync should also be disabled.
>
> Yes, after the manual reshard it should clear the leftovers, but in my
> situation the resharding failed and I got double entries for that bucket.
> I didn't push it further; instead I divided the bucket into new buckets
> and reduced the object count with a new bucket tree. I copied all of the
> objects with rclone and started the bucket removal:
>
>     radosgw-admin bucket rm --bucket=mybucket --bypass-gc --purge-objects --max-concurrent-ios=128
>
> It has been running for a very long time (started on Sep 08) and it is
> still working. There were 250M objects in that bucket, and after the
> manual reshard failure I saw a 500M object count when checking
> num_objects in bucket stats. Now I have:
>
>     "size_kb": 10648067645,
>     "num_objects": 132270190
>
> The removal speed is 50-60 objects per second. It's not because of the
> cluster speed; the cluster is fine. I have the space, so I let it run.
> When I see a stable object count I will stop the removal and start it
> again with the --inconsistent-index parameter.
> I wonder, is it safe to use that parameter with referenced objects? I
> want to learn how --inconsistent-index works and what it does.
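As an aside (not from the thread): one simple way to watch such a long-running removal is to poll the same bucket stats mentioned above; the bucket name is a placeholder:

    # Re-check num_objects/size_kb every minute to gauge removal progress:
    watch -n 60 'radosgw-admin bucket stats --bucket=mybucket | grep -E "num_objects|size_kb"'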
Сергей Процун <prosergey07@xxxxxxxxx> wrote on Fri, 5 Nov 2021 at 17:46:

>> There should not be any issues using rgw for other buckets while
>> re-sharding.
>>
>> The doubling of the object count after the reshard is an interesting
>> situation, though. After a manual reshard is done, there might be
>> leftovers from the old bucket index, since the reshard creates new
>> .dir.NEW_BUCKET_INDEX objects; these contain all the data related to
>> the objects stored in the buckets.data pool. I am wondering whether the
>> doubled object count was related to the old bucket index. If so, it is
>> safe to delete the old bucket index.
>>
>> In a perfect world it would be ideal to know the eventual number of
>> objects inside the bucket and set the number of shards accordingly from
>> the start.
>>
>> In the real world, when the client re-purposes the bucket, we have to
>> deal with reshards.
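To make that up-front sizing concrete (an illustration, not from the thread): with rgw's default of rgw_max_objs_per_shard = 100000 index entries per shard, the initial shard count can be estimated from the expected object count and then rounded up to a prime, as suggested below.

    # Sizing sketch, assuming the default rgw_max_objs_per_shard=100000:
    objects_expected=50000000                        # planned bucket size (example value)
    echo $(( (objects_expected + 99999) / 100000 ))  # -> 500; round up to a prime such as 503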
On Fri, 5 Nov 2021 at 14:43, mhnx <morphinwithyou@xxxxxxxxx> wrote:

>>> I also use this method and I hate it.
>>>
>>> Stopping all of the RGW clients is never an option! It shouldn't be.
>>> Sharding is hell. I had 250M objects in a bucket, the reshard failed
>>> after 2 days, and the object count somehow doubled! 2 days of downtime
>>> is not an option.
>>>
>>> I wonder: if I stop reads and writes on a bucket while resharding it,
>>> is there any problem with using the RGWs with all the other buckets?
>>>
>>> Nowadays I advise splitting buckets as much as you can! That means
>>> changing your app's directory tree, but this design requires it.
>>> You need to plan the object count at least 5 years ahead and create
>>> the buckets accordingly. Usually I use 101 shards, which means
>>> 10,100,000 objects. If I need versioning I use 2x101 or 3x101, because
>>> versions are hard to predict. You need to predict how many versions
>>> you need and set a lifecycle even before using the bucket!
>>> The maximum shard count I use is 1999. I'm not happy about it, but
>>> sometimes you gotta do what you need to do.
>>> Fighting with customers is not an option; you can only advise changing
>>> their app's folder tree, but I've never seen anyone accept the deal
>>> without arguing.
>>>
>>> My proposals usually look like this:
>>> 1- Core files bucket: no changes, or very limited changes. Calculate
>>> the object count and multiply by 2.
>>> 2- Hot data bucket: there will be daily changes and versioning.
>>> Calculate the object count and multiply by 3.
>>> 3- Cold data bucket[s]: there will be no daily changes. You should
>>> open new buckets every year or month. This keeps things clean and
>>> steady. No need for versioning, and multisite will not suffer, because
>>> there are barely any changes.
>>> 4- Temp files bucket[s]: this is important. If you're crawling
>>> millions upon millions of objects every day and deleting them at the
>>> end of the week or month, then you should definitely use a temp
>>> bucket. No versioning, no multisite, no index if possible.
>>>
>>> Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx> wrote on Fri, 5 Nov
>>> 2021 at 12:30:
>>>
>>> > You mean prepare or preshard?
>>> > Prepare:
>>> > I collect as much information from the users as possible before
>>> > onboarding, so I can prepare for their future use case and set
>>> > things up.
>>> >
>>> > Preshard:
>>> > After creating the bucket:
>>> >
>>> >     radosgw-admin bucket reshard --bucket=ex-bucket --num-shards=101
>>> >
>>> > Also, when you shard the buckets, you need to use prime numbers.
>>> >
>>> > Istvan Szabo
>>> > Senior Infrastructure Engineer
>>> > Agoda Services Co., Ltd.
>>> > e: istvan.szabo@xxxxxxxxx
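To see whether an existing bucket is approaching the point where presharding or resharding is needed, radosgw-admin also has a limit check that reports the per-shard fill level (an aside, not from the thread; the exact output fields may vary by release):

    # Show objects_per_shard and fill_status for each bucket:
    radosgw-admin bucket limit check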
From: Boris Behrens <bb@xxxxxxxxx>
Sent: Friday, November 5, 2021 4:22 PM
To: Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>; ceph-users@xxxxxxx
Subject: Re: large bucket index in multisite environment (how to deal with large omap objects warning)?

Cheers Istvan,

how do you do this?

--
The self-help group "UTF-8 problems" is meeting in the large hall this time, as an exception.

On Thu, 4 Nov 2021 at 19:45, Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx> wrote:

> This one you need to prepare: you need to preshard any bucket which you
> know will hold more than millions of objects.
>
> I have a bucket where we store 1.2 billion objects with 24xxx shards.
> No omap issue.
>
> Istvan Szabo
> Senior Infrastructure Engineer
> Agoda Services Co., Ltd.
> e: istvan.szabo@xxxxxxxxx
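A rough arithmetic check on those numbers (assuming "24xxx" means roughly 24000 shards, since the exact count was elided): 1.2 billion index entries spread over ~24000 shards is about 50000 omap keys per shard, comfortably below the default large-omap warning threshold of 200000 keys (osd_deep_scrub_large_omap_object_key_threshold).

    # Approximate per-shard key count for the bucket above:
    echo $(( 1200000000 / 24000 ))   # -> 50000 keys per shard, vs. the 200000-key default warning threshold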
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



