Hi Loic et al,

Dan pointed me to this: http://sourceforge.net/p/snapraid/code/ci/master/tree/raid.c

It has a very straightforward API and a GPL license. The implementation seems more performant than the current Jerasure library, probably due to the use of SSSE3 extensions and slightly less flexibility. Maybe it is worth a plugin, or even becoming "the" plugin? It also seems worthwhile to rewrite the XOR function I use with the SSE2 assembler XOR.

It also comes with a nice benchmark tool. Here are the results on my 'standard' Xeon for a 4 MB block with 8 data disks + parity disks:

./snapraid -T
snapraid v5.0 by Andrea Mazzoleni, http://snapraid.sourceforge.net
Compiler gcc 4.8.1
CPU GenuineIntel, family 6, model 26, flags mmx sse2 ssse3 sse42
Memory is little-endian 64-bit
Support nanosecond timestamps with futimens()

Speed test using 8 buffers of 524288 bytes, for a total of 4096 KiB.
The reported value is the sustainable aggregate bandwidth of all data disks in MiB/s (not counting parity disks).
Memory write speed using the C memset() function:
   memset   15873

CRC used to check the content file integrity:
   table      857
   intel     6689

Hash used to check the data blocks integrity:
             best   murmur3   spooky2
   hash   spooky2      2987      6998

RAID functions used for computing the parity with 'sync':
             best   int8   int32   int64    sse2   sse2e   ssse3   ssse3e
   par1     sse2           6201   11080   19404
   par2    sse2e           1851    3462    9949   10359
   parz    sse2e           1134    2020    5157    5738
   par3   ssse3e    421                                     4766     5225
   par4   ssse3e    303                                     3449     3844
   par5   ssse3e    241                                     2750     2830
   par6   ssse3e    198                                     2189     2261

RAID functions used for recovering with 'fix':
            best   int8   ssse3
   rec1   ssse3    496    1029
   rec2   ssse3    208     477
   rec3   ssse3     51     261
   rec4   ssse3     33     170
   rec5   ssse3     22     112
   rec6   ssse3     16      86

________________________________________
From: Loic Dachary [loic@xxxxxxxxxxx]
Sent: 12 November 2013 19:06
To: Andreas Joachim Peters
Cc: ceph-devel@xxxxxxxxxxxxxxx
Subject: Re: CEPH Erasure Encoding + OSD Scalability

Hi Andreas,

On 12/11/2013 02:11, Andreas Joachim Peters wrote:
> Hi Loic,
>
> I am finally doing the benchmark tool, and I found a bunch of wrong parameter checks which can make the whole thing SEGV.
>
> All the RAID-6 codes have restrictions on the parameters, but they are not correctly enforced for the Liberation & Blaum-Roth codes in the CEPH wrapper class ... see this text from the PDF:
>
> "Minimal Density RAID-6 codes are MDS codes based on binary matrices which satisfy a lower-bound on the number of non-zero entries. Unlike Cauchy coding, the bit-matrix elements do not correspond to elements in GF(2^w). Instead, the bit-matrix itself has the proper MDS property. Minimal Density RAID-6 codes perform faster than Reed-Solomon and Cauchy Reed-Solomon codes for the same parameters. Liberation coding, Liber8tion coding, and Blaum-Roth coding are three examples of this kind of coding that are supported in jerasure.
>
> With each of these codes, m must be equal to two and k must be less than or equal to w.
> The value of w has restrictions based on the code:
>
> • With Liberation coding, w must be a prime number [Pla08b].
> • With Blaum-Roth coding, w + 1 must be a prime number [BR99].
> • With Liber8tion coding, w must equal 8 [Pla08a]."
>
> ...
>
> Do you add these fixes?

Nice catch. I created and assigned it to myself: http://tracker.ceph.com/issues/6754

> For the benchmark suite: it currently runs the 308 different configurations of the 2 algorithms which make sense from a performance point of view, and provides this output:
>
> # -----------------------------------------------------------------
> # Erasure Coding Benchmark - (C) CERN 2013 - Andreas.Joachim.Peters@xxxxxxx
> # Ram-Size=12614856704 Allocation-Size=100000000
> # -----------------------------------------------------------------
> # [ -BENCH- ] [       ] technique=memcpy speed=5.408 [GB/s] latency=9.245 ms
> # [ -BENCH- ] [       ] technique=d=a^b^c-xor speed=4.377 [GB/s] latency=17.136 ms
> # [ -BENCH- ] [001/304] technique=cauchy_good:k=05:m=2:w=8:lp=0:packet=00064:size=50000000 speed=1.308 [GB/s] latency=038 [ms] size-overhead=40 [%]
> ..
> ..
> # [ -BENCH- ] [304/304] technique=liberation:k=24:m=2:w=29:lp=2:packet=65536:size=50000000 speed=0.083 [GB/s] latency=604 [ms] size-overhead=16 [%]
> # -----------------------------------------------------------------
> # Erasure Code Performance Summary::
> # -----------------------------------------------------------------
> # RAM: 12.61 GB
> # Allocation-Size 0.10 GB
> # -----------------------------------------------------------------
> # Byte Initialization: 29.35 MB/s
> # Memcpy: 5.41 GB/s
> # Triple-XOR: 4.38 GB/s
> # -----------------------------------------------------------------
> # Fastest RAID6          2.72 GB/s liber8tion:k=06:m=2:w=8:lp=0:packet=04096:size=50000000
> # Fastest Triple Failure 0.96 GB/s cauchy_good:k=06:m=3:w=8:lp=0:packet=04096:size=50000000
> # Fastest Quadr. Failure 0.66 GB/s cauchy_good:k=06:m=4:w=8:lp=0:packet=04096:size=50000000
> # -----------------------------------------------------------------
> # .................................................................
> # Top 1 RAID6  2.72 GB/s liber8tion:k=06:m=2:w=8:lp=0:packet=04096:size=50000000
> # Top 2 RAID6  2.72 GB/s liber8tion:k=06:m=2:w=8:lp=0:packet=16384:size=50000000
> # Top 3 RAID6  2.64 GB/s liber8tion:k=06:m=2:w=8:lp=0:packet=65536:size=50000000
> # Top 4 RAID6  2.60 GB/s liberation:k=07:m=2:w=7:lp=0:packet=16384:size=50000000
> # Top 5 RAID6  2.59 GB/s liberation:k=05:m=2:w=7:lp=0:packet=04096:size=50000000
> # .................................................................
> # Top 1 Triple 0.96 GB/s cauchy_good:k=06:m=3:w=8:lp=0:packet=04096:size=50000000
> # Top 2 Triple 0.94 GB/s cauchy_good:k=06:m=3:w=8:lp=0:packet=16384:size=50000000
> # Top 3 Triple 0.93 GB/s cauchy_good:k=06:m=3:w=8:lp=0:packet=65536:size=50000000
> # Top 4 Triple 0.89 GB/s cauchy_good:k=07:m=3:w=8:lp=0:packet=04096:size=50000000
> # Top 5 Triple 0.87 GB/s cauchy_good:k=05:m=3:w=8:lp=0:packet=04096:size=50000000
> # .................................................................
> # Top 1 Quadr. 0.66 GB/s cauchy_good:k=06:m=4:w=8:lp=0:packet=04096:size=50000000
> # Top 2 Quadr. 0.65 GB/s cauchy_good:k=07:m=4:w=8:lp=0:packet=04096:size=50000000
> # Top 3 Quadr. 0.64 GB/s cauchy_good:k=06:m=4:w=8:lp=0:packet=16384:size=50000000
> # Top 4 Quadr. 0.64 GB/s cauchy_good:k=05:m=4:w=8:lp=0:packet=04096:size=50000000
> # Top 5 Quadr. 0.64 GB/s cauchy_good:k=06:m=4:w=8:lp=0:packet=65536:size=50000000
> # .................................................................
>
> It takes around 30 seconds on my box.

That looks great :-)

If I understand correctly, it means https://github.com/ceph/ceph/pull/740 will no longer have benchmarks, as they are moved to a separate program. Correct?
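For what it's worth, the restrictions quoted from the PDF above (m == 2, k <= w, plus a per-code rule on w) boil down to a small check. Here is a minimal sketch in C; the function name raid6_params_valid and the technique strings are illustrative only, not the actual CEPH wrapper or jerasure API:

```c
#include <stdbool.h>
#include <string.h>

/* Trial-division primality test; w is small, so this is plenty fast. */
static bool is_prime(int n) {
    if (n < 2) return false;
    for (int i = 2; i * i <= n; i++)
        if (n % i == 0) return false;
    return true;
}

/* Minimal Density RAID-6 codes require m == 2 and k <= w,
 * plus a per-code restriction on w:
 *   liberation: w prime, blaum_roth: w+1 prime, liber8tion: w == 8. */
bool raid6_params_valid(const char *technique, int k, int m, int w) {
    if (m != 2 || k > w)
        return false;
    if (strcmp(technique, "liberation") == 0)
        return is_prime(w);
    if (strcmp(technique, "blaum_roth") == 0)
        return is_prime(w + 1);
    if (strcmp(technique, "liber8tion") == 0)
        return w == 8;
    return false; /* unknown technique */
}
```

Enforcing this before handing the parameters to jerasure would turn the SEGV into a clean error return.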
> I will add a measurement of how the XOR and the 3 top algorithms scale with the number of cores, and make the object size configurable from the command line. Anything else?

It would be convenient to run this from a "workunit" (i.e. a script in ceph/qa/workunits/) so that it can later be run by teuthology integration tests. That could be used to show regressions.

> Shall I add the possibility to test a single user-specified configuration via command line arguments?

I would need to play with it to comment usefully.

Cheers

--
Loïc Dachary, Artisan Logiciel Libre
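P.S. on the SSE2 XOR rewrite mentioned at the top of the thread: with compiler intrinsics instead of hand-written assembler, such a XOR kernel could look like the sketch below. This is illustrative only (it is not SnapRAID's or CEPH's actual implementation); it uses unaligned loads/stores and falls back to a scalar loop for the tail:

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* dst[i] ^= src[i] for i in [0, len), 16 bytes per vector iteration.
 * Unaligned load/store variants are used so callers need not
 * guarantee 16-byte alignment. */
void xor_sse2(uint8_t *dst, const uint8_t *src, size_t len) {
    size_t i = 0;
    for (; i + 16 <= len; i += 16) {
        __m128i a = _mm_loadu_si128((const __m128i *)(dst + i));
        __m128i b = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_xor_si128(a, b));
    }
    for (; i < len; i++)  /* scalar tail for the remaining bytes */
        dst[i] ^= src[i];
}
```

Applied once per data buffer, this computes the single parity (par1); applying it twice restores the original buffer, since XOR is its own inverse.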