Re: CEPH Erasure Encoding + OSD Scalability

Hi Andreas,

On 12/11/2013 02:11, Andreas Joachim Peters wrote:
> Hi Loic,
> 
> I am finally writing the benchmark tool, and I found a number of incorrect parameter checks that can make the whole thing segfault.
> 
> All the RAID-6 codes have restrictions on their parameters, but these are not correctly enforced for the Liberation and Blaum-Roth codes in the CEPH wrapper class. See this text from the PDF:
> 
> "Minimal Density RAID-6 codes are MDS codes based on binary matrices which satisfy a lower-bound on the number of non-zero entries. Unlike Cauchy coding, the bit-matrix elements do not correspond to elements in GF(2^w). Instead, the bit-matrix itself has the proper MDS property. Minimal Density RAID-6 codes perform faster than Reed-Solomon and Cauchy Reed-Solomon codes for the same parameters. Liberation coding, Liber8tion coding, and Blaum-Roth coding are three examples of this kind of coding that are supported in jerasure.
> 
> With each of these codes, m must be equal to two and k must be less than or equal to w. The value of w has restrictions based on the code:
> 
> • With Liberation coding, w must be a prime number [Pla08b].
> • With Blaum-Roth coding, w + 1 must be a prime number [BR99].
> • With Liber8tion coding, w must equal 8 [Pla08a].
> 
> ...
> 
> Will you add these fixes?

Nice catch. I created http://tracker.ceph.com/issues/6754 and assigned it to myself.
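
To make the restrictions quoted above concrete, here is a minimal sketch of the checks in Python. It is not the code of the CEPH wrapper class: the function names are hypothetical, and the technique string "blaum_roth" is an assumption (only "liberation" and "liber8tion" appear in the benchmark output in this thread). It only encodes the rules from the quoted text: m = 2, k <= w, and the per-code constraint on w.

```python
from typing import List


def is_prime(n: int) -> bool:
    """Trial-division primality test, sufficient for small w."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def check_raid6_params(technique: str, k: int, m: int, w: int) -> List[str]:
    """Return the list of violated restrictions (empty if the parameters are valid).

    Hypothetical helper: the technique names are assumptions for illustration.
    """
    errors = []
    if m != 2:
        errors.append("m must be equal to 2")
    if k > w:
        errors.append("k must be less than or equal to w")
    if technique == "liberation":
        if not is_prime(w):
            errors.append("w must be a prime number")
    elif technique == "blaum_roth":
        if not is_prime(w + 1):
            errors.append("w + 1 must be a prime number")
    elif technique == "liber8tion":
        if w != 8:
            errors.append("w must equal 8")
    else:
        errors.append("unknown minimal density technique: " + technique)
    return errors
```

For example, check_raid6_params("liberation", 7, 2, 7) returns an empty list, while check_raid6_params("liberation", 5, 2, 8) reports that w must be a prime number. Rejecting such parameters up front is what would avoid reaching the encoder with values it cannot handle.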
> 
> The benchmark suite currently runs 308 different configurations for the 2 algorithms that make sense from a performance point of view, and it produces this output:
> 
> 
> # -----------------------------------------------------------------
> # Erasure Coding Benchmark - (C) CERN 2013 - Andreas.Joachim.Peters@xxxxxxx
> # Ram-Size=12614856704 Allocation-Size=100000000
> # -----------------------------------------------------------------
> # [ -BENCH- ] [       ] technique=memcpy                                                            speed=5.408 [GB/s] latency=9.245 ms
> # [ -BENCH- ] [       ] technique=d=a^b^c-xor                                                       speed=4.377 [GB/s] latency=17.136 ms
> # [ -BENCH- ] [001/304] technique=cauchy_good:k=05:m=2:w=8:lp=0:packet=00064:size=50000000          speed=1.308 [GB/s] latency=038	[ms] size-overhead=40	[%]
> ..
> ..
> # [ -BENCH- ] [304/304] technique=liberation:k=24:m=2:w=29:lp=2:packet=65536:size=50000000          speed=0.083 [GB/s] latency=604	[ms] size-overhead=16	[%]
> # -----------------------------------------------------------------
> # Erasure Code Performance Summary::
> # -----------------------------------------------------------------
> # RAM:                   12.61 GB
> # Allocation-Size        0.10 GB
> # -----------------------------------------------------------------
> # Byte Initialization:   29.35 MB/s
> # Memcpy:                5.41 GB/s
> # Triple-XOR:            4.38 GB/s
> # -----------------------------------------------------------------
> # Fastest RAID6          2.72 GB/s liber8tion:k=06:m=2:w=8:lp=0:packet=04096:size=50000000
> # Fastest Triple Failure 0.96 GB/s cauchy_good:k=06:m=3:w=8:lp=0:packet=04096:size=50000000
> # Fastest Quadr. Failure 0.66 GB/s cauchy_good:k=06:m=4:w=8:lp=0:packet=04096:size=50000000
> # -----------------------------------------------------------------
> # .................................................................
> # Top 1  RAID6          2.72 GB/s liber8tion:k=06:m=2:w=8:lp=0:packet=04096:size=50000000
> # Top 2  RAID6          2.72 GB/s liber8tion:k=06:m=2:w=8:lp=0:packet=16384:size=50000000
> # Top 3  RAID6          2.64 GB/s liber8tion:k=06:m=2:w=8:lp=0:packet=65536:size=50000000
> # Top 4  RAID6          2.60 GB/s liberation:k=07:m=2:w=7:lp=0:packet=16384:size=50000000
> # Top 5  RAID6          2.59 GB/s liberation:k=05:m=2:w=7:lp=0:packet=04096:size=50000000
> # .................................................................
> # Top 1  Triple         0.96 GB/s cauchy_good:k=06:m=3:w=8:lp=0:packet=04096:size=50000000
> # Top 2  Triple         0.94 GB/s cauchy_good:k=06:m=3:w=8:lp=0:packet=16384:size=50000000
> # Top 3  Triple         0.93 GB/s cauchy_good:k=06:m=3:w=8:lp=0:packet=65536:size=50000000
> # Top 4  Triple         0.89 GB/s cauchy_good:k=07:m=3:w=8:lp=0:packet=04096:size=50000000
> # Top 5  Triple         0.87 GB/s cauchy_good:k=05:m=3:w=8:lp=0:packet=04096:size=50000000
> # .................................................................
> # Top 1  Quadr.         0.66 GB/s cauchy_good:k=06:m=4:w=8:lp=0:packet=04096:size=50000000
> # Top 2  Quadr.         0.65 GB/s cauchy_good:k=07:m=4:w=8:lp=0:packet=04096:size=50000000
> # Top 3  Quadr.         0.64 GB/s cauchy_good:k=06:m=4:w=8:lp=0:packet=16384:size=50000000
> # Top 4  Quadr.         0.64 GB/s cauchy_good:k=05:m=4:w=8:lp=0:packet=04096:size=50000000
> # Top 5  Quadr.         0.64 GB/s cauchy_good:k=06:m=4:w=8:lp=0:packet=65536:size=50000000
> # .................................................................
> 
> It takes around 30 seconds on my box.


That looks great :-) If I understand correctly, it means https://github.com/ceph/ceph/pull/740 will no longer contain benchmarks, as they are moved to a separate program. Correct?

> I will add a measurement of how the XOR and the 3 top algorithms scale with the number of cores, and make the object size configurable from the command line. Anything else?

It would be convenient to run this from a "workunit" (i.e. a script in ceph/qa/workunits/) so that it can later be run by teuthology integration tests. That could be used to detect regressions.
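
As a rough illustration of the regression idea, the sketch below parses the "[ -BENCH- ]" lines in the format quoted above and compares two runs. It is only a sketch under assumptions: the function names and the 10% tolerance are hypothetical, and it does not use any teuthology or workunit API.

```python
import re
from typing import Dict, List

# Matches e.g. "technique=cauchy_good:k=05:...   speed=1.308 [GB/s]"
LINE_RE = re.compile(
    r"technique=(?P<technique>\S+)\s+speed=(?P<speed>[0-9.]+)\s+\[GB/s\]"
)


def parse_speeds(output: str) -> Dict[str, float]:
    """Map each technique string found in the benchmark output to its speed in GB/s."""
    speeds = {}
    for line in output.splitlines():
        match = LINE_RE.search(line)
        if match:
            speeds[match.group("technique")] = float(match.group("speed"))
    return speeds


def regressions(baseline: Dict[str, float],
                current: Dict[str, float],
                tolerance: float = 0.10) -> List[str]:
    """Return the techniques whose speed dropped by more than `tolerance`.

    The 10% default is an arbitrary assumption, not a value from this thread.
    Techniques missing from the current run are ignored.
    """
    slow = []
    for technique, old_speed in baseline.items():
        new_speed = current.get(technique)
        if new_speed is not None and new_speed < old_speed * (1.0 - tolerance):
            slow.append(technique)
    return sorted(slow)
```

A workunit could store the parsed speeds of a known-good run and fail when regressions() returns a non-empty list for a later run on the same hardware.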

> Shall I add the possibility to test a single user-specified configuration via command-line arguments?
> 
I would need to play with it to comment usefully.

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre