Re: Object Map Costs (Was: Snapshot Costs (Was: Re: Pool Sizes))

Kent Borg <kentborg@xxxxxxxx> · Thu, 9 Mar 2017 10:51:25 -0500

On 03/08/2017 05:07 PM, Gregory Farnum wrote:
How about iterating through a whole set of values vs. reading a RADOS object
holding the same amount of data?
"Iterating"?

As in rados_read_op_omap_get_vals(), "Start iterating over key/value 
pairs on an object."
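
To make "iterating" concrete: rados_read_op_omap_get_vals() takes a start_after key and a max_return count, so walking a whole omap amounts to a series of bounded range reads, each resuming after the last key seen. Below is a minimal sketch of that loop only. It does not call librados; fetch_page and make_fake_fetch are hypothetical stand-ins for whatever wrapper issues the real read op, and the in-memory dict simply imitates an omap's sorted key order.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Page = List[Tuple[str, bytes]]


def iter_omap(fetch_page: Callable[[str, int], Page],
              page_size: int = 100) -> Iterator[Tuple[str, bytes]]:
    """Yield every (key, value) pair, requesting one sorted page at a time.

    fetch_page(start_after, max_return) is a hypothetical callback that
    returns up to max_return pairs whose keys sort after start_after.
    """
    start_after = ""  # empty string: begin at the first key
    while True:
        page = fetch_page(start_after, page_size)
        for key, value in page:
            yield key, value
        if len(page) < page_size:
            return  # a short (or empty) page means no keys are left
        start_after = page[-1][0]  # resume after the last key seen


def make_fake_fetch(omap: Dict[str, bytes]) -> Callable[[str, int], Page]:
    """In-memory stand-in for one object's omap (assumption: non-empty keys)."""
    keys = sorted(omap)

    def fetch_page(start_after: str, max_return: int) -> Page:
        later = [k for k in keys if k > start_after]
        return [(k, omap[k]) for k in later[:max_return]]

    return fetch_page
```

Under these assumptions, reading N pairs costs roughly N / page_size round trips, whereas reading the same bytes from one RADOS object can be a single read. That difference in round trips is the comparison the question was asking about.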

In general, you should use the format that is appropriate for the data
and usage pattern rather than worrying about performance — they are
optimized for the interfaces we expose! ;)

But looks can deceive. For example, your API exposes a call to find out 
how many objects are in a pool. But experimentally I discovered that the 
count it returns can be both too low and too high. Once I understand 
(better) how it is implemented, I can see why I should not use it for 
more than an estimate.

Or, silly me, I saw an interface that exposes creation and deletion of 
pools: Don't do that! (Well, hardly ever.)

Understanding how it works under the hood makes these things much clearer.

Another example, omap values vs. xattrs: they are an odd set of 
siblings, but they make much more sense once one knows the 
implementation differences.

(I am guessing that an xattr read or write, to an XFS OSD, would be 
faster than an omap read or write, unless the xattr overflows in size 
and becomes a LevelDB transaction. Right? Also, I can imagine xattrs 
being deprecated once Bluestore settles in and starts to get comfortable.)

Ceph is like some strange predator that can swallow beasts far, far 
bigger than itself. (EMC and NetApp and...?) Us folk out here, programming 
at the RADOS layer (though I am starting to think maybe there are very 
few of me), need to understand which parts are so dang stretchy and 
which ones are not. The dynamic range between what is suitable to put in 
a single xattr and the petabytes of a cluster is considerable. There is a 
lot of room to scale things wrong.

But I am getting the hang of it. Slowly.

Thanks,

-kb
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com