Re: Appending to an erasure coded pool

pool_requires_alignment can get you the pool's stripe_width, and you need
to write a multiple of that size in each append. stripe_width can be
configured with osd_pool_erasure_code_stripe_width, but the actual size
will be adjusted by the EC plugin.
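
Roughly, with the librados C API that would look something like this
(untested sketch; rados_ioctx_pool_requires_alignment() and
rados_ioctx_pool_required_alignment() are the calls I believe expose
this, both declared in rados/librados.h):

    #include <rados/librados.h>
    #include <stdint.h>

    /* Ask librados whether the pool needs aligned appends (EC pools do)
     * and, if so, what the alignment (the stripe_width) is.  A return
     * of 0 means any append length is fine (replicated pool). */
    static uint64_t pool_alignment(rados_ioctx_t io)
    {
        if (!rados_ioctx_pool_requires_alignment(io))
            return 0;
        return rados_ioctx_pool_required_alignment(io);
    }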

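Since holding the whole file in memory is off the table (see James's
note below about rados_write_full), a buffered loop along these lines
should let you keep using rados_append; the carry-over buffer is my own
construction here, not something librados gives you:

    #include <rados/librados.h>
    #include <string.h>

    /* Sketch: the caller accumulates incoming data in buf and calls this
     * to flush the largest prefix that is a whole multiple of "align"
     * (from pool_alignment() above) with rados_append, keeping the
     * unaligned tail in the buffer for the next round. */
    static int append_aligned(rados_ioctx_t io, const char *oid,
                              uint64_t align, char *buf, size_t *buflen)
    {
        size_t writable;
        int ret;

        if (align == 0)
            writable = *buflen;                    /* replicated: anything goes */
        else
            writable = (*buflen / align) * align;  /* largest aligned prefix */

        if (writable == 0)
            return 0;                              /* not enough buffered yet */

        ret = rados_append(io, oid, buf, writable);
        if (ret < 0)
            return ret;

        memmove(buf, buf + writable, *buflen - writable);  /* keep the tail */
        *buflen -= writable;
        return 0;
    }

At end of file whatever is left in the buffer is still unaligned; as far
as I can tell an EC pool will refuse that too, so you would have to
zero-pad the final chunk up to the alignment and record the real length
somewhere (an xattr, say). Worth verifying on your own cluster, I have
not tested this.
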
2016-10-17 18:34 GMT+08:00 James Norman <james@xxxxxxxxxxxxxxxxxxx>:
> Hi Gregory,
>
> Many thanks for your reply. I couldn't spot any resources that describe/show
> how you can successfully write / append to an EC pool with the librados API
> on those links. Do you know of any such examples or resources? Or is it just
> simply not possible?
>
> Best regards,
>
> James Norman
>
> On 6 Oct 2016, at 19:17, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>
> On Thu, Oct 6, 2016 at 4:08 AM, James Norman <james@xxxxxxxxxxxxxxxxxxx>
> wrote:
>
> Hi there,
>
> I am developing a web application that supports browsing, uploading,
> downloading, moving files in Ceph Rados pool. Internally to write objects we
> use rados_append, as it's often too memory intensive for us to have the full
> file in memory to do a rados_write_full.
>
> We do not control our customer's Ceph installations, such as whether they
> use replicated pools, EC pools etc. We've found that when dealing with an EC
> pool, our rados_append calls return error code 95 and message "Operation not
> supported".
>
> I've had several discussions with members in the IRC chatroom regarding
> this, and the general consensus I've got is:
> 1) Use write alignment.
> 2) Put a replicated pool in front of the EC pool
> 3) EC pools have a limited feature set
>
> Regarding point 1), are there any actual code examples for how you would
> handle this in the context of rados_append? I have struggled to find even
> one. This seems to me something that should be handled by either the API
> libraries, or Ceph itself, not the client trying to write some data.
>
>
> librados requires a fair bit of knowledge from the user applications,
> yes. One thing you mention that sounds concerning is that you can't
> hold the objects in-memory — RADOS is not comfortable with very large
> objects and you'll find that things like backfill might not perform as
> you expect. (At this point everything will *probably* function, but it
> may be so slow as to make no difference to you when it hits that
> situation.) Certainly if your objects do not all fit neatly into
> buckets of a particular size and you have some that are very large,
> you will have a very non-uniform balance.
>
> But, if you want to learn about EC pools there is some documentation
> at http://docs.ceph.com/docs/master/dev/osd_internals/erasure_coding/
> (or in ceph.git/doc/dev/osd_internals/erasure_coding) from when they
> were being created.
>
>
> Regarding point 2) This seems to be a workaround, and generally not
> something we want to recommend to our customers. Is it detrimental to use an
> EC pool without a replicated pool? What are the performance costs of doing
> so?
>
>
> Yeah, don't do that. Cache pools are really tricky to use properly and
> turned out not to perform very well.
>
>
> Regarding point 3) Can you point me towards resources that describe what
> features / abilities you lose by adopting an EC pool?
>
>
> Same as above links, apparently. But really, you can read from and
> append to them. There are no object classes, no arbitrary overwrites,
> no omaps.
> -Greg
>
>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



