Hi Gregory,
Many thanks for your reply. I couldn't spot any resources in those links that describe or show how you can successfully write or append to an EC pool with the librados API. Do you know of any such examples or resources, or is it simply not possible?
Best regards,
On Thu, Oct 6, 2016 at 4:08 AM, James Norman <james@xxxxxxxxxxxxxxxxxxx> wrote:

Hi there,
I am developing a web application that supports browsing, uploading, downloading, and moving files in a Ceph RADOS pool. Internally we write objects with rados_append, as it is often too memory-intensive for us to hold the full file in memory to do a rados_write_full.
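For context, a minimal sketch of the chunked-append upload path described above, assuming the librados C API (the chunk size, object name, and helper name are just placeholders, and error handling is abbreviated):

/*
 * Stream a local file into a RADOS object with rados_append(),
 * one chunk at a time, so the whole file never sits in memory.
 */
#include <rados/librados.h>
#include <stdio.h>
#include <stdlib.h>

#define CHUNK_SIZE (4 * 1024 * 1024)   /* append 4 MiB at a time */

int upload_file(rados_ioctx_t io, const char *oid, const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (!fp)
        return -1;

    char *buf = malloc(CHUNK_SIZE);
    if (!buf) {
        fclose(fp);
        return -1;
    }

    size_t n;
    int ret = 0;
    while ((n = fread(buf, 1, CHUNK_SIZE, fp)) > 0) {
        ret = rados_append(io, oid, buf, n);
        if (ret < 0)   /* on an EC pool this is where -EOPNOTSUPP (95) shows up */
            break;
    }

    free(buf);
    fclose(fp);
    return ret;
}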
We do not control our customers' Ceph installations, such as whether they use replicated pools, EC pools, etc. We have found that when dealing with an EC pool, our rados_append calls return error code 95 and the message "Operation not supported".
I've had several discussions with members in the IRC chatroom regarding this, and the general consensus I've gotten is:

1) Use write alignment.
2) Put a replicated pool in front of the EC pool.
3) EC pools have a limited feature set.
Regarding point 1), are there any actual code examples for how you would handle this in the context of rados_append? I have struggled to find even one. This seems to me like something that should be handled by either the API libraries or Ceph itself, not by the client trying to write some data.
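For illustration, the closest I can picture is querying the pool's required alignment and sizing every append to a multiple of it. A rough sketch, assuming the librados C API (rados_ioctx_pool_requires_alignment and rados_ioctx_pool_required_alignment are existing librados calls; the zero-padding of the final partial chunk is purely our own guess, not anything documented):

/*
 * Append data sized to the pool's required write alignment.
 * On a replicated pool (no alignment requirement) this degrades
 * to a plain rados_append().
 */
#include <rados/librados.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

int aligned_append(rados_ioctx_t io, const char *oid,
                   const char *buf, size_t len)
{
    if (!rados_ioctx_pool_requires_alignment(io))
        return rados_append(io, oid, buf, len);

    uint64_t align = rados_ioctx_pool_required_alignment(io);

    /* Append the largest prefix that is a whole multiple of 'align'. */
    size_t whole = (len / align) * align;
    int ret = 0;
    if (whole > 0)
        ret = rados_append(io, oid, buf, whole);
    if (ret < 0 || whole == len)
        return ret;

    /* Remainder: pad out to one alignment unit with zeroes. A real
     * client would presumably have to buffer this tail until more
     * data arrives rather than padding the object. */
    char *tail = calloc(1, align);
    if (!tail)
        return -ENOMEM;
    memcpy(tail, buf + whole, len - whole);
    ret = rados_append(io, oid, tail, align);
    free(tail);
    return ret;
}

Even if something like this works, it is exactly the sort of bookkeeping I would expect the library to take care of.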
librados requires a fair bit of knowledge from the user applications, yes. One thing you mention that sounds concerning is that you can't hold the objects in memory: RADOS is not comfortable with very large objects, and you'll find that things like backfill might not perform as you expect. (At this point everything will *probably* function, but it may be so slow as to make no difference to you when it hits that situation.) Certainly if your objects do not all fit neatly into buckets of a particular size and you have some that are very large, you will have a very non-uniform balance. But if you want to learn about EC pools, there is some documentation at http://docs.ceph.com/docs/master/dev/osd_internals/erasure_coding/ (or in ceph.git/doc/dev/osd_internals/erasure_coding) from when they were being created.

Regarding point 2), this seems to be a workaround, and generally not something we want to recommend to our customers. Is it detrimental to use an EC pool without a replicated pool in front of it? What are the performance costs of doing so?
Yeah, don't do that. Cache pools are really tricky to use properly and turned out not to perform very well.

Regarding point 3), can you point me towards resources that describe what features/abilities you lose by adopting an EC pool?
Same as above links, apparently. But really, you can read from and append to them. There are no object classes, no arbitrary overwrites, no omaps.

-Greg