On 06/18/2013 04:22 PM, James Plank wrote:
> Hi all -- thank you for including me on this thread, although I have little substantive to add. At the moment, my sole focus is finishing a journal paper about GF implementations, with a concomitant GF-complete release to accompany it. I agree that the CPU burden of the GF arithmetic will not be a bottleneck in your system, regardless of which implementation you use, as long as you stay at or below GF(2^16). If you want to go higher, GF-complete will help. When we put out a new release (the code will be ready within two weeks; the documentation, however, is lagging), I'll let you know. I think LRC is a nice coding paradigm, although I imagine that it has IP issues with Microsoft. I don't have first-hand experience with network/regenerating codes, and I'll be honest -- there have been so many papers in that realm that I am not up to date on them.
>
> Is there a question on which you'd like some help? It sounds as though you are at two decision points: What code should you use, and at which point on the space-overhead/fault-tolerance curve would you like to be?

Hi James,

Unless someone objects, it looks like Ceph is going to use jerasure-1.2 with Reed-Solomon. I'm glad to hear that GF arithmetic will not be a bottleneck: we're going to stay below GF(2^8). However, minimizing the CPU footprint is essential, and I'm looking forward to using the next version, including the SIMD optimizations that you demonstrated in gf-complete.

I wrote down a short description of the read/write path I plan to implement in Ceph:

https://github.com/dachary/ceph/blob/wip-4929/doc/dev/osd_internals/erasure-code.rst

A quick look at the drawings will hopefully give you an idea. Each OSD is a disk connected to the others over the network.
Although I chose K+M = 5, I suspect the most common use case will be around K+M = 7+3 = 10.

I've seen that jerasure-1.2 provides not only classic Reed-Solomon but also Cauchy Reed-Solomon and liberation / minimal density MDS codes. I assume classic Reed-Solomon is best suited for the default Ceph use case described above, but I'm not sure. What do you think?

Thanks a lot for your advice :-) It helps me write sensible code.

Cheers

>
> Best wishes,
>
> Jim
> ----------
>
> On Jun 18, 2013, at 3:44 AM, Benoît Parrein wrote:
>
>> Hi Paul,
>>
>> thank you for your message
>>
>> from my point of view, LRC focuses on the repair problem: how to reconstruct a destroyed node so that the distributed system maintains the same availability?
>> in this context they can even go below 1x rate by introducing local parity on classical Reed-Solomon blocks (but they pay a supplementary overhead). see Alex Dimakis's excellent papers for that. but, still from my point of view, the same relationship between redundancy and availability holds (if you consider a binomial model for your losses).
>>
>> best
>> bp
>>
>>
>> On 17/06/2013 18:55, Paul Von-Stamwitz wrote:
>>> Loic,
>>>
>>> As Benoit points out, Mojette uses discrete geometry rather than algebra, so simple XOR is all that is needed.
>>>
>>> Benoit,
>>>
>>> Microsoft's paper states that their [12,2,2] LRC provides better availability than 3x replication with 1.33x efficiency. 1.5x is certainly a good number. I'm just pointing out that better efficiency can be had without losing availability.
>>>
>>> All the best,
>>> Paul

--
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do nothing.
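
The space-overhead figures quoted in this thread (K+M = 7+3, 3x replication, the [12,2,2] LRC at 1.33x) and the binomial loss model Benoît mentions can be checked with a few lines of Python. This is only a rough sketch: the function names and the per-chunk failure probability p = 0.01 are assumptions made for illustration, chunk failures are assumed independent, and the loss formula applies only to MDS codes such as Reed-Solomon (replication is treated as the trivial 1+2 case). The LRC is not MDS, so only its overhead is computed.

```python
from math import comb


def storage_overhead(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data with k data and m coding chunks."""
    return (k + m) / k


def loss_probability(k: int, m: int, p: float) -> float:
    """Probability that more than m of the k+m chunks are lost.

    Assumes each chunk fails independently with probability p (binomial
    model) and that the code is MDS, i.e. any k chunks suffice to decode.
    """
    n = k + m
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(m + 1, n + 1))


if __name__ == "__main__":
    p = 0.01  # hypothetical per-chunk failure probability, not from the thread
    for name, k, m in [("3x replication", 1, 2), ("Reed-Solomon 7+3", 7, 3)]:
        print(
            f"{name}: overhead {storage_overhead(k, m):.2f}x, "
            f"P(loss) {loss_probability(k, m, p):.3e}"
        )
    # [12,2,2] LRC: 12 data chunks plus 2 local and 2 global parities.
    print(f"LRC [12,2,2]: overhead {storage_overhead(12, 4):.2f}x")
```

Under these assumptions the 7+3 layout stores 10/7 (about 1.43x) of the user data, against 3x for replication, which is the trade-off Jim refers to as the space-overhead/fault-tolerance curve.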