Hello,
I'm looking for advice on how best to configure mdraid with 16 or more
drives. My goal is to maximize sequential read speed while maintaining a
high level of fault tolerance (2 or more drive failures).
I had a 16-drive raid6 array that performed reasonably well. Peak
per-drive throughput during a check was ~180MB/s, which is not far from
the native drive performance of ~200MB/s. I recently added 4 more drives
and now the check runs at only 50MB/s per drive.
I had a similar issue on another, older, 12-drive system, which I did
solve by setting group_thread_cnt to 2, but in that case the md1_resync
process was very close to 100% CPU; here it sits at only about 50%. On
this (20-drive) system, setting group_thread_cnt to 2 caused the check
speed to drop to 35MB/s.
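For reference, I'm toggling it like this (2 is simply the value that
helped on the older box; 0 is the single-threaded default):

# add raid5/6 worker threads for md1
echo 2 > /sys/block/md1/md/group_thread_cnt
# revert to the default
echo 0 > /sys/block/md1/md/group_thread_cnt

Current state of the array: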
md1 : active raid6 sdf1[10] sdh1[16] sdi1[17] sdb1[13] sdc1[12] sdg1[8]
      sda1[15] sde1[9] sdd1[11] sds1[18] sdn1[5] sdq1[6] sdj1[1] sdp1[14]
      sdk1[2] sdl1[3] sdt1[19] sdm1[0] sdr1[4] sdo1[7]
      140650080768 blocks super 1.2 level 6, 128k chunk, algorithm 2
      [20/20] [UUUUUUUUUUUUUUUUUUUU]
      [=============>.......]  check = 69.3% (5419742336/7813893376)
      finish=795.6min speed=50150K/sec
      bitmap: 0/4 pages [0KB], 1048576KB chunk
# cat /sys/block/md1/md/stripe_cache_size
32768
I have confirmed that this system can read from all 20 drives at full
speed (200MB/s) simultaneously by running 20 dd processes in parallel,
so this is not a bus, CPU, or memory-bandwidth constraint of the system.
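The test was roughly the following (iflag=direct to bypass the page
cache; sda through sdt are the member disks shown above):

for d in /dev/sd[a-t]; do
    # sequential read of 10GB from each disk, all in parallel
    dd if=$d of=/dev/null bs=1M count=10240 iflag=direct &
done
wait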
The array is mainly used for media files, and accesses are large and
sequential. This particular host stores backups, so maintaining high
sequential throughput is important to keep backup/restore times to a
minimum.
I recall reading that the mdraid raid6 implementation is fundamentally
single-threaded and suffers from memory-bandwidth pressure due to the
many copy operations involved. How does one best avoid these
constraints? Does chunk size matter?
Would raid 60 help? If I split this into two 10-drive raid6 arrays with
a raid0 striped on top, would each raid6 run on a separate core?
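For concreteness, the raid 60 layout I have in mind is something like
this (device names and chunk size purely illustrative):

# two 10-drive raid6 legs
mdadm --create /dev/md2 --level=6 --raid-devices=10 --chunk=128 /dev/sd[a-j]1
mdadm --create /dev/md3 --level=6 --raid-devices=10 --chunk=128 /dev/sd[k-t]1
# raid0 stripe over the two raid6 arrays
mdadm --create /dev/md4 --level=0 --raid-devices=2 /dev/md2 /dev/md3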
Any advice would be appreciated. Thank you.
--Larkin