Hi,

I would like to use Ceph to store a lot of small objects. Our current usage pattern is 4.5 billion unique objects, ranging from 0 to 100 MB, with a median size of 3-4 kB. Overall, that's around 350 TB of raw data to store, which isn't much, but it's spread across a *lot* of tiny files. We expect growth of around a third per year, and the object size distribution to stay essentially the same (it's been stable for the past three years, and we don't see that changing).

Our access pattern is a very simple key -> value store, where the key happens to be the SHA-1 of the content we're storing. All metadata is stored externally; we really only need dumb object storage. Our redundancy requirement is to be able to withstand the loss of 2 OSDs.

After looking at our options for storage in Ceph, I dismissed (perhaps hastily) RGW for its metadata overhead, and went straight to plain RADOS. I've set up an erasure-coded storage pool with default settings, using k=5 and m=2 (expecting a 40% increase in storage use over the plain contents). After storing objects in the pool, I see a storage usage of 700% instead of 140%.

My understanding of the erasure code profile docs [1] is that objects below the stripe width (k * stripe_unit, which in my case is 20 kB) can't be chunked for erasure coding, which makes RADOS fall back to plain object copying, with k+m copies.

[1] http://docs.ceph.com/docs/master/rados/operations/erasure-code-profile/

Is my understanding correct? Does anyone have experience with this kind of storage workload in Ceph?

If my understanding is correct, I'll end up adding size-based tiering in my object storage layer, shuffling objects between two pools with different settings according to their size. That's not too bad, but I'd like to make sure I'm not completely misunderstanding something.

Thanks!
--
Nicolas Dandrimont
Backend Engineer, Software Heritage

BOFH excuse #170:
popper unable to process jumbo kernel
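
PS: to make the expected overheads concrete, here's a minimal back-of-the-envelope sketch (plain Python, not Ceph code) of the hypothesis above. The stripe_unit of 4 kB is an assumed default rather than something verified on our cluster:

    # Models the hypothesis above: objects smaller than the stripe width
    # (k * stripe_unit) are effectively stored as k+m full copies, while
    # larger objects see the usual (k+m)/k erasure-coding overhead.
    # stripe_unit = 4 kB is an assumption, not a measured value.
    K, M = 5, 2
    STRIPE_UNIT = 4 * 1024            # bytes, assumed default
    STRIPE_WIDTH = K * STRIPE_UNIT    # 20 kB for this profile

    def overhead(object_size):
        """Approximate ratio of raw bytes stored to logical object size."""
        if object_size < STRIPE_WIDTH:
            return float(K + M)       # ~700%: k+m plain copies
        return (K + M) / K            # ~140%: normal EC overhead

    for size in (3 * 1024, 20 * 1024, 10 * 1024 * 1024):
        print("%10d B -> ~%.0f%% raw storage" % (size, overhead(size) * 100))

With a 3-4 kB median object size, almost every object falls into the ~700% case, which would match what we observe.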