Re: determine the size

John, 

Things seem clearer now, thank you.

I don't know if this has already been covered, or if I may ask, but I gather that it depends on the object name in the first place to determine which PG the object will map to, and then which OSD.
 
What I know is that, given the object name, Ceph computes a hash of it (let's call the result the "oid"); then oid % num_of_pgs gives the PG id, and after that the CRUSH algorithm determines which OSDs will be affected.
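
To make sure I have the idea right, here is a rough sketch of the mapping I described. It is only a toy illustration: as far as I understand, Ceph actually uses its own hash (rjenkins) and a "stable mod" over the pool's PG count, so the function name and the md5 hash below are just stand-ins:

    import hashlib

    def pg_for_object(object_name: str, pg_num: int) -> int:
        # Toy stand-in: hash the object name, then reduce it modulo the
        # number of PGs in the pool. Real Ceph uses the rjenkins hash and
        # a "stable mod", so treat this only as an illustration.
        oid_hash = int(hashlib.md5(object_name.encode()).hexdigest(), 16)
        return oid_hash % pg_num

    # e.g. an object "foo" in a pool with 128 PGs:
    print(pg_for_object("foo", 128))  # some PG id in [0, 127]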

> So can I find out how Ceph calculates the PG id from the object name, the precise equation or formula?
> And can I retrieve the hash table that contains all the PGs I have, and the objects too?

Hope I'm making myself clear!

Best regards,
Thank you  


On Fri, Apr 12, 2013 at 9:03 PM, John Wilkins <john.wilkins@xxxxxxxxxxx> wrote:
Waed, 

I've seen you returning to this line of questioning several times, and I assume the reason is that you are still trying to understand the reasons for computing object locations rather than having them statically defined. Ceph calculates object placement and distributes data randomly. This has numerous benefits, but the benefits aren't obvious at first:

1. No Bottleneck or Single Point of Failure: Instead of contacting a centralized broker to look up and retrieve an object, a Ceph client calculates where the object should be (based on the cluster map and the CRUSH algorithm) and contacts the OSD directly. This eliminates a single point of failure and a performance bottleneck under heavy loads. It's conceivable to have some sort of round robin algorithm build a data allocation table, but this requires contacting the server each time you want to retrieve an object to determine the object location--resulting in chatty sessions. It also makes dynamic rebalancing a bit of a nightmare.

2. Load Distribution: By distributing data randomly across the cluster, load spikes on one OSD generally don't occur. In the scenario you described, you may in fact get better performance by having the objects on separate OSDs, depending on their size. The time to establish a connection is certainly a factor, but the total throughput of the disk and the network card are also considerations. If you want to read or write two large objects simultaneously, you would likely get better performance by having them on separate OSDs/hosts, because the sequential read and write throughput of the disk is usually the bottleneck.

3. Dynamic Rebalancing: By computing where to store an object, Ceph can dynamically rebalance the cluster. Clients don't need to "know" where the object is in order to retrieve it. What they need to know is the current state of the cluster, which they retrieve from the monitor. Calculating an object location is much faster than a look-up, so it doesn't involve a performance penalty. The client and server don't need to be in sync with respect to object locations either. They only need to be in sync in terms of the current state of the cluster.
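
To make the "calculate instead of look up" point concrete, here is a toy sketch. It uses rendezvous hashing rather than CRUSH itself, and all names are illustrative, but it shows the property that matters: every client with the same view of the cluster computes the same placement with no lookup table, and when an OSD disappears only the PGs that mapped to it move:

    import hashlib

    def _score(pg_id: int, osd_id: int) -> int:
        # Deterministic pseudo-random score for a (PG, OSD) pair.
        return int(hashlib.md5(f"{pg_id}:{osd_id}".encode()).hexdigest(), 16)

    def place_pg(pg_id, osds, replicas=2):
        # Toy stand-in for CRUSH: any client holding the same cluster
        # state (the same 'osds' list) computes the same answer, with
        # no central broker involved.
        return sorted(osds, key=lambda osd: _score(pg_id, osd), reverse=True)[:replicas]

    cluster = [0, 1, 2, 3]
    before = {pg: place_pg(pg, cluster) for pg in range(8)}

    # Drop osd.3 from the cluster map: only PGs that had osd.3 in their
    # placement change; everything else stays put, which is what keeps
    # rebalancing cheap and lookup-free.
    after = {pg: place_pg(pg, [o for o in cluster if o != 3]) for pg in range(8)}
    moved = [pg for pg in before if before[pg] != after[pg]]
    print(moved)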

You might also want to have a look at these sections of the documentation: 




On Wed, Apr 10, 2013 at 1:56 PM, Waed Bataineh <promiselady90@xxxxxxxxx> wrote:
On Wednesday, April 10, 2013, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
> On Wednesday, April 10, 2013 at 2:53 AM, Waed Bataineh wrote:
>> Hello,
>>
>> I have several questions; I'd appreciate it if I could get answers to them:
>>
>> 1. Does the OSD have a fixed size, or does it adapt to the machine
>> I'm working with?
>
> You can weight OSDs to account for different capacities or speeds; is
> that what you're asking?

I know about the weight attribute, let's say. What I meant is: can we think
of the OSDs as memory space, i.e. something that takes a certain amount of the
machine's RAM?
>> If that is the case, what is the equation?
>>
>> 2. I can list all the objects in a certain pool, but can we
>> determine the objects on a specific OSD, via the command line I mean?
>
> Not really, no. You could construct the information by listing all the
> objects in each pool and calculating whether they live on the OSD in question,
> but there is no interface to have an OSD go list its contents.

Ok.
>> 3. Finally, does it make a difference if I'm reading two objects from the same
>> OSD versus reading two objects from two OSDs?
>
> Differ how? There is no special handling for multiple reads from the
> same OSD, if that's what you're asking.

I meant in terms of time. Take the following scenario: if we notice that a
certain client always, or let's say frequently, needs to open two objects, and
these two are on different OSDs, would it be better to place those objects on
the same OSD?
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
>
>



--
John Wilkins
Senior Technical Writer
Inktank
john.wilkins@xxxxxxxxxxx
(415) 425-9599
http://inktank.com

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
