Re: large bucket index in multisite environment (how to deal with large omap objects warning)?


 



Buckets that are not replicated you can reshard, but better to confirm with other people as well.
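
Something like this should do it, assuming a bucket called "mybucket" (the name is just an example); check that the bucket is not being synced first, then reshard it manually:

radosgw-admin bucket sync status --bucket=mybucket
radosgw-admin bucket reshard --bucket=mybucket --num-shards=101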

Istvan Szabo
Senior Infrastructure Engineer
---------------------------------------------------
Agoda Services Co., Ltd.
e: istvan.szabo@xxxxxxxxx
---------------------------------------------------

-----Original Message-----
From: Boris Behrens <bb@xxxxxxxxx> 
Sent: Monday, November 8, 2021 6:46 PM
To: Ceph Users <ceph-users@xxxxxxx>
Subject:  Re: large bucket index in multisite environment (how to deal with large omap objects warning)?


Maybe I missed it, but can't I just reshard buckets when they are not replicated / synced / mirrored (what is the correct ceph terminology for this)?

On Mon, 8 Nov 2021 at 12:28, mhnx <morphinwithyou@xxxxxxxxx> wrote:

> (There should not be any issues using rgw for other buckets while
> re-sharding.)
> If there are issues, then disabling access to the bucket should work, right?
> Sync should also be disabled.
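> A rough sketch of the per-bucket sync toggle I have in mind (bucket name
> is just an example):
> radosgw-admin bucket sync disable --bucket=mybucket
> radosgw-admin bucket sync enable --bucket=mybucket    (after the reshard)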
>
> Yes, after the manual reshard it should clear the leftovers, but in my
> situation the resharding failed and I got double entries for that bucket.
> I didn't push further; instead I split the bucket into new buckets and
> reduced the object count with a new bucket tree. I copied all of the
> objects with rclone and started the bucket removal with "radosgw-admin
> bucket rm --bucket=mybucket --bypass-gc --purge-objects
> --max-concurrent-ios=128". It has been running for a very long time
> (started on Sep 08) and it is still working. There were 250M objects in
> that bucket, and after the manual reshard failure I got a 500M object
> count when checking with bucket stats num_objects. Now I have:
> "size_kb": 10648067645,
> "num_objects": 132270190
>
> The removal speed is 50-60 objects per second. It's not because of the
> cluster's speed; the cluster is fine.
> I have the space, so I let it run. When I see a stable object count I will
> stop the removal process and start it again with the
> "--inconsistent-index" parameter.
> I wonder, is it safe to use that parameter with referenced objects? I
> want to learn how "--inconsistent-index" works and what it does.
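> For reference, the command line I plan to retry with would look like this
> (not tested yet):
> radosgw-admin bucket rm --bucket=mybucket --bypass-gc --purge-objects
> --inconsistent-index --max-concurrent-ios=128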
>
> Сергей Процун <prosergey07@xxxxxxxxx> wrote on Fri, 5 Nov 2021 at 17:46:
>
>> There should not be any issues using rgw for other buckets while 
>> re-sharding.
>>
>> As for the number of objects doubling after the reshard, that is an
>> interesting situation. After a manual reshard is done, there might be
>> leftovers from the old bucket index, since during the reshard new
>> .dir.new_bucket_index objects are created. They contain all the data
>> related to the objects which are stored in the buckets.data pool. Just
>> wondering if the issue with the doubled number of objects was related
>> to the old bucket index. If so, it is safe to delete the old bucket index.
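>>
>> A rough way to spot such leftovers (note that the stale-instances
>> cleanup is intended for single-site setups, so double check before
>> running it against multisite):
>> radosgw-admin reshard stale-instances list
>> radosgw-admin reshard stale-instances rm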
>>
>>  In a perfect world, it would be ideal to know the eventual number of
>> objects inside the bucket and set the number of shards to the
>> corresponding value initially.
>>
>>  In the real world, when the client re-purposes the bucket's usage, we
>> have to deal with reshards.
>>
>> On Fri, 5 Nov 2021 at 14:43, mhnx <morphinwithyou@xxxxxxxxx> wrote:
>>
>>> I also use this method and I hate it.
>>>
>>> Stopping all of the RGW clients is never an option! It shouldn't be.
>>> Sharding is hell. I had 250M objects in a bucket and a reshard
>>> failed after 2 days, and the object count somehow doubled! 2 days of
>>> downtime is not an option.
>>>
>>> I wonder, if I stop reads and writes on a bucket while resharding it,
>>> is there any problem using the RGWs with all the other buckets?
>>>
>>> Nowadays I advise splitting buckets as much as you can! That means
>>> changing your app's directory tree, but this design requires it.
>>> You need to plan the object count for at least 5 years and create the
>>> buckets accordingly.
>>> Usually I use 101 shards, which means 10,100,000 objects.
>>> Also, if I need to use versioning I use 2x101 or 3x101, because
>>> versions are hard to predict. You need to predict how many versions
>>> you need and set a lifecycle even before using the bucket!
>>> The maximum shard count that I use is 1999. I'm not happy about it, but
>>> sometimes you gotta do what you need to do.
>>> Fighting with customers is not an option; you can only advise changing
>>> their app's folder tree, but I've never seen someone accept the deal
>>> without arguing.
>>>
>>> My proposals usually look like this:
>>> 1- Core files bucket: no need to change, or very limited changes.
>>> Calculate the object count and multiply by 2.
>>> 2- Hot data bucket: there will be daily changes and versioning.
>>> Calculate the object count and multiply by 3.
>>> 3- Cold data bucket[s]: there will be no daily changes. You should
>>> open new buckets every year or month. This is good to keep things
>>> clean and steady. No need for versioning, and multisite will not
>>> suffer since there are barely any changes.
>>> 4- Temp files bucket[s]: this is so important. If you're crawling
>>> millions upon millions of objects every day and deleting them at the
>>> end of the week or month, then you should definitely use a temp
>>> bucket. No versioning, no multisite, no index if possible.
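>>>
>>> To put rough numbers on the multipliers above (my rule of thumb; adjust
>>> to your environment): aim for about 100,000 objects per shard, so
>>> shards = expected_objects / 100,000, rounded up to a prime. For example,
>>> 10M expected objects -> 10,000,000 / 100,000 = 100 -> 101 shards; with
>>> versioning, multiply the expected object count by 2 or 3 first.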
>>>
>>>
>>>
>>> On Fri, 5 Nov 2021 at 12:30, Szabo, Istvan (Agoda)
>>> <Istvan.Szabo@xxxxxxxxx> wrote:
>>>
>>> > You mean prepare or preshard?
>>> > Prepare:
>>> > I collect as much information from the users as I can before
>>> > onboarding, so I can prepare for their use case in the future and
>>> > set things up.
>>> >
>>> > Preshard:
>>> > After creating the bucket:
>>> > radosgw-admin bucket reshard --bucket=ex-bucket --num-shards=101
>>> >
>>> > Also when you shard the buckets, you need to use prime numbers.
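>>> >
>>> > To check how full the index shards are afterwards, something like
>>> > this should work:
>>> > radosgw-admin bucket limit check
>>> > radosgw-admin reshard status --bucket=ex-bucket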
>>> >
>>> > Istvan Szabo
>>> > Senior Infrastructure Engineer
>>> > ---------------------------------------------------
>>> > Agoda Services Co., Ltd.
>>> > e: istvan.szabo@xxxxxxxxx
>>> > ---------------------------------------------------
>>> >
>>> > From: Boris Behrens <bb@xxxxxxxxx>
>>> > Sent: Friday, November 5, 2021 4:22 PM
>>> > To: Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>; 
>>> > ceph-users@xxxxxxx
>>> > Subject: Re:  large bucket index in multisite
>>> > environment (how to deal with large omap objects warning)?
>>> >
>>> > Cheers Istvan,
>>> >
>>> > how do you do this?
>>> >
>>> > On Thu, 4 Nov 2021 at 19:45, Szabo, Istvan (Agoda)
>>> > <Istvan.Szabo@xxxxxxxxx> wrote:
>>> > This one you need to prepare for; you need to preshard the bucket
>>> > which you know will hold more than millions of objects.
>>> >
>>> > I have a bucket where we store 1.2 billion objects with 24xxx
>>> > shards.
>>> > No omap issue.
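>>> > (Using the ~100k objects per shard rule of thumb from earlier in the
>>> > thread, 1.2 billion objects needs at least 1,200,000,000 / 100,000 =
>>> > 12,000 shards, so a shard count in the 24xxx range leaves roughly 2x
>>> > headroom.)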
>>> > Istvan Szabo
>>> > Senior Infrastructure Engineer
>>> > ---------------------------------------------------
>>> > Agoda Services Co., Ltd.
>>> > e: istvan.szabo@xxxxxxxxx
>>> > ---------------------------------------------------
>>> >
>>> >
>>> >
>>> > --
>>> > The self-help group "UTF-8 Problems" will meet this time, as an
>>> > exception, in the large hall.
>>>
>>

--
The self-help group "UTF-8 Problems" will meet this time, as an exception, in the large hall.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



