Hi!

* Gregory Farnum <gfarnum@xxxxxxxxxx> [2018-06-28 19:31:09 -0700]:
> That’s close but not *quite* right. It’s not that Ceph will explicitly
> “fall back” to replication. In most (though perhaps not all) erasure codes,
> what you’ll see is full sized parity blocks, a full store of the data (in
> the default Reed-Solomon that will just be full-sized chunks up to however
> many are needed to store it fully in a single copy), and the remaining data
> chunks (out of the k) will have no data. *But* Ceph will keep the “object
> info” metadata in each shard, so all the OSDs in a PG will still witness
> all the writes.

That makes sense. To make sure this is what's actually happening, and
combining this insight with the info Paul Emmerich gave about the BlueStore
min alloc size, I've done an analysis of how many stripes my objects would
take, and how much space usage that would incur.

So far, I've loaded 92.4 million objects, and I've run the analysis over a
random sample of 10 million of them. Counting the number of 5*64k stripes
each object would take, and multiplying that by 1.4 (sketched below), I get
a space usage estimate that's within a percent of the actual usage, so,
yeah, it seems you're both right. Ouch :-)

> > If my understanding is correct, I'll end up adding size tiering on my
> > object storage layer, shuffling objects in two pools with different
> > settings according to their size. That's not too bad, but I'd like to
> > make sure I'm not completely misunderstanding something.
>
> That’s probably a reasonable response, especially if you are already
> maintaining an index for other purposes!

I guess that's back to the drawing board for us, because the 64k minimal
allocation will also happen on basic replicated pools, and we can store a
_lot_ of objects in a 64k block ;)

Thanks for your insights,
--
Nicolas Dandrimont

BOFH excuse #259:
Someone's tie is caught in the printer, and if anything else gets
printed, he'll be in it too.
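[Editorial sketch of the estimate described above. It assumes a k=5, m=2
erasure-coded pool (inferred from the 5*64k stripe width and the 1.4 =
(k+m)/k overhead factor mentioned in the thread) and BlueStore's default
64 KiB min_alloc_size. The function name and sample sizes are hypothetical,
not taken from the original analysis.]

```python
import math

# Assumed pool parameters, inferred from the thread (hypothetical):
# k=5 data chunks, m=2 coding chunks, BlueStore min_alloc_size of 64 KiB.
K = 5
M = 2
MIN_ALLOC = 64 * 1024          # bytes per allocated chunk on each shard
STRIPE_WIDTH = K * MIN_ALLOC   # 5 * 64 KiB of user data per stripe
EC_OVERHEAD = (K + M) / K      # (k+m)/k = 7/5 = 1.4

def estimated_raw_usage(object_size: int) -> int:
    """Estimate the raw bytes one object consumes in the EC pool.

    Each object is assumed to occupy a whole number of k*64 KiB stripes
    (at least one, even for tiny objects), and every stripe carries the
    (k+m)/k erasure-coding overhead.
    """
    stripes = max(1, math.ceil(object_size / STRIPE_WIDTH))
    return int(stripes * STRIPE_WIDTH * EC_OVERHEAD)

# A 10 KiB object still pays for a full stripe: 5 * 64 KiB * 1.4 = 448 KiB.
print(estimated_raw_usage(10 * 1024))    # 458752 bytes (448 KiB)
# A 400 KiB object spans two stripes: 2 * 320 KiB * 1.4 = 896 KiB.
print(estimated_raw_usage(400 * 1024))   # 917504 bytes (896 KiB)
```

Summing estimated_raw_usage over the sampled objects (and scaling up to the
full object count) would reproduce the kind of estimate described in the
message; the 10 KiB example illustrates why a workload of many small
objects inflates raw usage so badly under these settings.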