Re: [Suspicious newsletter] Re: Unable to reshard bucket


 



Hi Eric,

Thank you for picking up my question.
Please correct me if I'm wrong about sharding and indexes.
As I understand the flow, when a user puts an object into the cluster, an entry is created in the index pool that holds, let's say, the location of the object in the data pool.
There is one index object per bucket, so as the number of objects in the bucket grows, the index object grows too.
This is where sharding comes into the picture: with sharding we can split this one big index object into smaller chunks. The documentation says we can calculate the number of shards using 100,000 objects per shard, which means a bucket with 100 shards can hold, let's say, 10 million objects.
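To make the arithmetic above concrete, here is a minimal sketch in Python. It assumes the 100,000 objects per shard figure is used purely as a sizing target; the function names are my own for illustration and are not part of any Ceph tool.

```python
# Sizing sketch: how many index shards a bucket needs for a given object count,
# assuming a target of 100,000 objects per shard (the figure quoted above).
OBJECTS_PER_SHARD = 100_000


def shards_needed(expected_objects: int,
                  objects_per_shard: int = OBJECTS_PER_SHARD) -> int:
    """Smallest shard count that keeps the average shard at or under the target."""
    if expected_objects < 0 or objects_per_shard <= 0:
        raise ValueError("object counts must be non-negative, target must be positive")
    # Integer ceiling division; always return at least one shard.
    return max(1, -(-expected_objects // objects_per_shard))


def objects_covered(num_shards: int,
                    objects_per_shard: int = OBJECTS_PER_SHARD) -> int:
    """Object count at which every shard would sit exactly at the target."""
    return num_shards * objects_per_shard


print(shards_needed(10_000_000))  # 100
print(objects_covered(100))       # 10000000
```

With these assumptions, 100 shards correspond to 10 million objects, which matches the numbers above.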

Now my situation: the bucket is set to 100 shards at 100,000 objects per shard. It has crossed 10 million objects and is now at 11.5 million. There is no issue, but to be honest I don't understand what is happening at the moment.
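For reference, the numbers in my situation can be worked out like this. It is only a rough sketch that assumes objects are spread evenly across the shards; the helper names are hypothetical.

```python
# Average load per index shard for the bucket described above,
# assuming an even spread of objects across shards.
def average_objects_per_shard(total_objects: int, num_shards: int) -> float:
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    return total_objects / num_shards


def fill_ratio(total_objects: int, num_shards: int,
               objects_per_shard: int = 100_000) -> float:
    """Ratio of the actual object count to the count implied by the per-shard target."""
    return total_objects / (num_shards * objects_per_shard)


print(average_objects_per_shard(11_500_000, 100))  # 115000.0
```

So each shard holds about 115,000 entries on average, roughly 15% above the 100,000 target.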

So if we don't know at bucket creation time how many objects are planned for the future, it seems better to set the shard count to a high number. And since the documentation says the maximum is 64k shards per bucket, why not set that number to avoid any limitation?

We also have a new cluster with multisite enabled, where dynamic bucket resharding is not even possible, so at the moment I don't know what I should set as a baseline before putting it into production.

Thank you in advance for your clarification.


-----Original Message-----
From: Eric Ivancich <ivancich@xxxxxxxxxx>
Sent: Wednesday, November 25, 2020 5:37 AM
To: Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>
Cc: ceph-users <ceph-users@xxxxxxx>
Subject: Re: [Suspicious newsletter]  Re: Unable to reshard bucket


Can you clarify, Istvan, what you plan on setting to 64K? If it’s the number of shards for a bucket, that would be a mistake.

> On Nov 21, 2020, at 2:09 AM, Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx> wrote:
>
> It seems we need to plan this sharding carefully from the beginning. I'm thinking of setting the default shard number to the maximum, which is 64k, and leaving it as is, so we will only reach the limit if we reach the maximum number of objects.
>
> It would be interesting to know what the side effects are if I set the shards to 64k by default.
>
> Istvan Szabo
> Senior Infrastructure Engineer

--
J. Eric Ivancich
he / him / his
Red Hat Storage
Ann Arbor, Michigan, USA


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



