Re: erasure code and coefficients

Hello Loic,
Dimakis (one of the authors of xorbas) talks about coefficients
because they want to reduce the storage overhead of LRC. In the simple
case shown in Fig. 2, RS (k=10, m=4) has a storage overhead of 14/10,
but with LRC the overhead increases to 17/10 because you also need to
store s1, s2 and s3. Basically, the idea is to find specific
coefficients c1..c10 that make it possible to obtain s3 from s1 and
s2. In other words, you want s1 and s2 such that, when XORed together,
they give s3. If you find such coefficients, you don't need to store
s3 and the storage overhead of LRC is 1.6x instead of 1.7x.
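
To make the arithmetic above explicit, here is a small sketch. The
helper name storage_overhead is only for illustration and does not
come from xorbas or Ceph; it simply counts stored blocks per data
block.

```python
from fractions import Fraction


def storage_overhead(k, m, local_parities):
    """Stored blocks divided by data blocks (hypothetical helper)."""
    return Fraction(k + m + local_parities, k)


# Plain Reed Solomon, k=10 data blocks and m=4 parity blocks.
rs = storage_overhead(10, 4, 0)

# LRC storing all three local parities s1, s2 and s3.
lrc_all_stored = storage_overhead(10, 4, 3)

# LRC where s3 can be derived from s1 and s2, so only two are stored.
lrc_s3_implied = storage_overhead(10, 4, 2)

for name, value in [("RS", rs),
                    ("LRC, s3 stored", lrc_all_stored),
                    ("LRC, s3 implied", lrc_s3_implied)]:
    print(f"{name}: {float(value):.1f}x")
```

The saving from not storing s3 is exactly one block out of ten data
blocks, which is the 1.7x to 1.6x difference.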

Dimakis said that for the Reed Solomon implementation used in HDFS
RAID they can simply set all coefficients to the value '1' and use XOR.
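
As a rough illustration of what "all coefficients set to 1" means, here
is a sketch under a few assumptions: blocks are plain byte strings,
the local groups are x1..x5 and x6..x10, and the relation
s3 = s1 XOR s2 is taken as given rather than derived from a real Reed
Solomon encoder. The function name xor_blocks is hypothetical and this
is not the xorbas or jerasure code.

```python
import os
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*blocks))


# Ten random 16-byte data blocks standing in for x1..x10.
data = [os.urandom(16) for _ in range(10)]

# With every coefficient equal to 1, a local parity is a plain XOR.
s1 = xor_blocks(data[:5])   # covers x1..x5
s2 = xor_blocks(data[5:])   # covers x6..x10

# ASSUMPTION: the coefficients are aligned so that s3 == s1 XOR s2,
# which means s3 can be recomputed instead of stored.
s3_implied = xor_blocks([s1, s2])

# Local repair: rebuild a lost block of the first group from the four
# surviving blocks of that group plus s1 (5 reads instead of 10).
lost = 2
survivors = [block for i, block in enumerate(data[:5]) if i != lost]
repaired = xor_blocks(survivors + [s1])
print(repaired == data[lost])
```

Whether the same relation holds for the parities produced by another
Reed Solomon implementation depends on how that implementation builds
its coding matrix.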

This may not be the case for the Reed Solomon implementation you use
(as I understand it, the jerasure library by Plank), but I am not
sure. I guess we need the help of a mathematician, or at least to
check and compare both implementations.

Finally, it appears that xorbas only implements the RS(10,4)
configuration and no other combinations. Unfortunately, the wiki page
of the project, http://wiki.apache.org/hadoop/ErasureCode, is empty
and the main page says 'erasure coding under development'.

I recommend that you watch the xorbas presentation video at
http://smahesh.com/HadoopUSC/ (a very clear explanation of xorbas) and
use the Dimakis wiki page to browse their large collection of papers:
http://storagewiki.ece.utexas.edu/

Best,

koleosfuscus

________________________________________________________________
"My reply is: the software has no known bugs, therefore it has not
been updated."
Wietse Venema


On Sun, Jun 29, 2014 at 11:30 AM, Loic Dachary <loic@xxxxxxxxxxx> wrote:
> Hi Andreas,
>
> In http://anrg.usc.edu/~maheswaran/Xorbas.pdf I get the idea of computing local coding chunks the way it is implemented in https://github.com/ceph/ceph/pull/1921 (i.e. delegating encoding / decoding to other plugins). However, there are theoretical aspects of the paper that I do not understand and I'm hoping you can shed some light on them. In particular, I don't know what "coefficients" are about. For instance, in the context of the Figure 2 caption: "The main theoretical challenge is to choose the coefficients c(i) to maximize the fault tolerance of the code."
>
> Would you recommend a paper to read to better understand this ? Also I'd like to understand what "coefficients" mean in the context of jerasure or if they do not apply.
>
> Thanks for your help :-)
>
> --
> Loïc Dachary, Artisan Logiciel Libre
>