On Fri, Feb 15, 2019 at 8:36 AM Roy Sigurd Karlsbakk <roy@xxxxxxxxxxxxx> wrote:
>
> >> Greetings !
> >>
> >> I created a MD RAID6 with a 512KiB chunk size out of 12 8TB drives, no internal
> >> bitmap and no journal on quad xeon gold 6154 running kernel 4.18 (Ubuntu
> >> 18.04.1) and set FIO to do a 1TiB sequential write to the device with a block
> >> size of 5M, 3 processes and a QD of 64.

Why use 3 processes?

> >>
> >> Each drive being able to achieve 215MiB/s at the beginning of the drive, I
> >> expected the output to be somewhere around the 2GiB/s mark at the beginning of
> >> the raid array.
> >> After setting stripe_cache_size to 32768 and group_thread_cnt to 2, I only got
> >> an average 1.4GiB/s out of my array and the throughput wasn't very stable.

A bigger stripe_cache_size does not always give better performance, and
the same goes for group_thread_cnt. Some more tuning of both may help.

> >>
> >> I did the same test against a hardware raid controller, the Broadcom MegaRAID
> >> 9480-8i8e, and it managed a nice flat 1.9 GiB/s.
> >>
> >> I expected a modern cpu to easily win over a hardware controller but that wasn't
> >> the case.
> >> Am I missing something ?
> >
> > At a wag... the 4GB ram cache on the raid card causing it to appear as
> > if the disk access is faster?
> >
> > I have to be honest, I've long since given up trying to test the
> > performance of raid formats/layouts/chunks/etc... due to the multiple
> > ways the system can "do stuff" that changes the results with even the
> > exact same manual style tests. Then again, my workloads tend to be "good
> > enough, is good enough". I guess, however, someone needing a high speed
> > file server bonded 10Gb links to multiple workstations running video
> > file editing software would be a whole different ballgame.
> >
>
> Well, something is bound to be wrong here when a RAID card is faster than
> using a far faster CPU for the work, with faster memory etc.
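
For reference, the arithmetic behind the numbers quoted above can be
sketched as follows. This is only a back-of-the-envelope check, not a
model of md: it assumes RAID6 spends two chunks per stripe on parity,
that fio's "5M" means 5 MiB (its default base-1024 units), and that the
215MiB/s per-drive figure holds for all drives at once. The variable
names are my own.

```python
# Figures taken from the original post.
DRIVES = 12              # member disks in the array
PARITY_CHUNKS = 2        # RAID6: two parity chunks (P and Q) per stripe
CHUNK_KIB = 512          # md chunk size
PER_DRIVE_MIB_S = 215    # sequential write speed at the start of each drive
OBSERVED_GIB_S = 1.4     # average reported for md

data_chunks = DRIVES - PARITY_CHUNKS              # data chunks per stripe
full_stripe_kib = data_chunks * CHUNK_KIB         # data carried by one full stripe
ideal_mib_s = data_chunks * PER_DRIVE_MIB_S       # upper bound on data throughput
ideal_gib_s = ideal_mib_s / 1024
efficiency = OBSERVED_GIB_S / ideal_gib_s         # observed share of that bound

print(f"full stripe: {full_stripe_kib} KiB ({full_stripe_kib / 1024:g} MiB)")
print(f"ideal: {ideal_mib_s} MiB/s ({ideal_gib_s:.2f} GiB/s)")
print(f"observed: {efficiency:.0%} of ideal")
```

So one full stripe carries 5 MiB of data, the same size as the fio block,
which means each request can be a full-stripe write only if it is also
aligned to a stripe boundary.
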
> Does anyone know how this can be debugged or fixed? Is there a possibility to
> choose which to use from SSE/AVX?

I think the kernel benchmarks the available SSE/AVX implementations and
picks the fastest one. dmesg will show something like

[    0.233184] raid6: sse2x1   gen()  8003 MB/s
[    0.250192] raid6: sse2x1   xor()  5982 MB/s
[    0.267208] raid6: sse2x2   gen() 10003 MB/s
[    0.284227] raid6: sse2x2   xor()  6937 MB/s
[    0.301242] raid6: sse2x4   gen() 12187 MB/s
[    0.318260] raid6: sse2x4   xor()  8029 MB/s
[    0.318427] raid6: using algorithm sse2x4 gen() 12187 MB/s
[    0.318639] raid6: .... xor() 8029 MB/s, rmw enabled
[    0.318833] raid6: using ssse3x2 recovery algorithm

If I were debugging this, I would first make sure the array is doing 100%
full-stripe writes (check the read/write rates with iostat or a similar
tool). Reads showing up during a pure write test would mean some writes
are not full-stripe.

>
> Kind regards
>
> roy
> --
> Roy Sigurd Karlsbakk
> (+47) 98013356
> http://blogg.karlsbakk.net/
> GPG Public key: http://karlsbakk.net/roysigurdkarlsbakk.pubkey.txt
> --
> Carve the good in stone, write the bad in snow.
>
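
The full-stripe check mentioned above could be done with iostat, or
roughly like this from two snapshots of /proc/diskstats taken before and
after the fio run. This is a minimal sketch: the function names, the
device names (sda, sdb) and the sample numbers are made up for
illustration, and it assumes nothing else (such as a resync) is touching
the member disks during the test.

```python
def parse_diskstats(text):
    """Map device name -> (sectors_read, sectors_written) from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or truncated lines
        # Field 3 is the device name, field 6 sectors read, field 10 sectors written.
        stats[fields[2]] = (int(fields[5]), int(fields[9]))
    return stats


def read_share(before, after, members):
    """Per member disk, the fraction of sectors moved between snapshots that were reads."""
    shares = {}
    for name in members:
        reads = after[name][0] - before[name][0]
        writes = after[name][1] - before[name][1]
        total = reads + writes
        shares[name] = reads / total if total else 0.0
    return shares


# Hypothetical snapshots; on a real system use open("/proc/diskstats").read().
BEFORE = """\
   8       0 sda 100 0 8000 50 200 0 16000 90 0 120 140
   8      16 sdb 100 0 8000 50 200 0 16000 90 0 120 140
"""
AFTER = """\
   8       0 sda 100 0 8000 50 1200 0 1040000 900 0 950 950
   8      16 sdb 350 0 264000 300 1200 0 1040000 900 0 1100 1200
"""

shares = read_share(parse_diskstats(BEFORE), parse_diskstats(AFTER), ["sda", "sdb"])
for name, share in shares.items():
    print(f"{name}: {share:.1%} of sectors were reads")
```

In this made-up sample sda saw no reads at all, while sdb did, which is
the pattern I would look for as a hint that md is doing
read-modify-write instead of full-stripe writes.
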