Re: Upcoming Erasure coding

Mark Kirkwood <mark.kirkwood@xxxxxxxxxxxxxxx> · Wed, 25 Dec 2013 11:16:45 +1300

On 25/12/13 04:33, Loic Dachary wrote:

On 24/12/2013 10:22, Wido den Hollander wrote:

IIRC, Erasure Encoding doesn't work well with RBD, if it works at all, because you can't update an object in place: you have to rewrite the whole object.

So Erasure encoding works great with the RADOS Gateway, but it doesn't with RBD or CephFS.

When using Erasure you should also be aware that recovery traffic can be 10x the traffic you would see with a replicated pool.
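
As a rough illustration of where a figure of that order could come from (this is an assumption added for clarity, not something stated in the thread): with a k+m Reed-Solomon-style code, rebuilding one lost chunk generally means reading k surviving chunks, whereas a replicated pool only has to copy the lost data once. The value k=10 and the function names below are hypothetical.

```python
# Back-of-the-envelope model of recovery reads per byte of lost data.
# Assumption: a k+m Reed-Solomon-style code that reads k surviving chunks
# to rebuild one lost chunk. Real traffic depends on the plugin and layout.

def replicated_recovery_reads(lost_bytes: int) -> int:
    """A replicated pool copies the lost data once from a surviving replica."""
    return lost_bytes


def erasure_recovery_reads(lost_bytes: int, k: int) -> int:
    """Each lost chunk is rebuilt from k surviving chunks of the same size."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return lost_bytes * k


def recovery_ratio(k: int) -> float:
    """Recovery reads of an erasure coded pool relative to a replicated one."""
    return erasure_recovery_reads(1, k) / replicated_recovery_reads(1)


if __name__ == "__main__":
    k = 10  # hypothetical number of data chunks
    print(f"k={k}: about {recovery_ratio(k):.0f}x the recovery reads")
```

Under this model the ratio is simply k, so smaller k values would reduce recovery traffic at the cost of more storage overhead.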

Wido

P.S.: Loic, please correct me if I'm wrong :)

You are correct: erasure code pools will not support all operations at first. They will be suitable for use with the tiering scenario I described, and most probably with the majority of operations done by radosgw. But the lack of support for partial writes makes it impossible to use an erasure coded pool as an RBD pool.

That raises an interesting question: what would be the benefit of having an erasure coded RBD pool instead of a replica RBD pool with an erasure coded second tier? In other words, is there a compelling reason to want:

RBD => erasure coded pool

instead of

RBD => replica pool => erasure code pool

where the objects are automatically moved to the erasure code pool if they are not used for more than X days.
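
The "not used for more than X days" rule could be sketched as below. This is only an illustration of the policy as described, not Ceph's tiering implementation; the function names, the pool labels and the 7-day threshold are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical sketch of an age-based tiering decision. Nothing here is a
# Ceph API; it only models "move to the erasure code pool after X idle days".


def should_demote(last_used: datetime, now: datetime, max_idle_days: int) -> bool:
    """True if the object has been idle for more than max_idle_days."""
    return now - last_used > timedelta(days=max_idle_days)


def target_pool(last_used: datetime, now: datetime, max_idle_days: int) -> str:
    """Name the tier an object should live in under this policy."""
    if should_demote(last_used, now, max_idle_days):
        return "erasure"
    return "replica"


if __name__ == "__main__":
    now = datetime(2013, 12, 25)
    x_days = 7  # hypothetical value for X
    for last_used in (datetime(2013, 12, 1), datetime(2013, 12, 24)):
        print(last_used.date(), "->", target_pool(last_used, now, x_days))
```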

I may have misunderstood this, but the rewrite of the entire object is at
the RADOS level, right? So it would be a rewrite of (say) an entire 4M chunk
of an RBD image if any part of that chunk needs a change.

If so, it seems to me that such a design could still be workable for
write once, read lots (and maybe delete) workloads, e.g. data
loading/analysis etc.
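
To put rough numbers on the whole-object rewrite discussed above, here is a minimal sketch. It assumes the 4M object size mentioned in this message and that every object a write touches is rewritten in full; the function names are hypothetical and this is not how RBD or RADOS is actually implemented.

```python
# Rough model of write amplification when any partial write forces a
# rewrite of every whole object it touches. Assumption: 4 MiB objects.

OBJECT_SIZE = 4 * 1024 * 1024


def objects_touched(offset: int, length: int, object_size: int = OBJECT_SIZE) -> int:
    """Number of objects overlapped by a write of `length` bytes at `offset`."""
    if length <= 0:
        return 0
    first = offset // object_size
    last = (offset + length - 1) // object_size
    return last - first + 1


def bytes_rewritten(offset: int, length: int, object_size: int = OBJECT_SIZE) -> int:
    """Bytes written if each touched object is rewritten in full."""
    return objects_touched(offset, length, object_size) * object_size


def write_amplification(offset: int, length: int, object_size: int = OBJECT_SIZE) -> float:
    """Ratio of bytes rewritten to bytes the client asked to write."""
    if length <= 0:
        raise ValueError("length must be positive")
    return bytes_rewritten(offset, length, object_size) / length


if __name__ == "__main__":
    print(write_amplification(0, 4096))         # small random write: 1024.0
    print(write_amplification(0, OBJECT_SIZE))  # full-object write: 1.0
```

In this model a 4 KiB update costs a full 4 MiB rewrite, while object-aligned sequential loads see no amplification, which is consistent with the write once, read lots case.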

Regards

Mark
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com