On Wednesday November 13, jakob@unthought.net wrote:
> On Wed, Nov 13, 2002 at 02:33:46PM +1100, Neil Brown wrote:
> ...
> > > The benchmark goes:
> > >
> > > | some tests on raid5 with 4k and 128k chunk size.  The results are as follows:
> > > | Access Spec       4K(MBps)    4K-deg(MBps)   128K(MBps)   128K-deg(MBps)
> > > | 2K Seq Read      23.015089    33.293993      25.415035    32.669278
> > > | 2K Seq Write     27.363041    30.555328      14.185889    16.087862
> > > | 64K Seq Read     22.952559    44.414774      26.02711     44.036993
> > > | 64K Seq Write    25.171833    32.67759       13.97861     15.618126
> > > These numbers look ... interesting.  I might try to reproduce them myself.
> > > So down from 27MB/sec to 14MB/sec running 2k-block sequential writes on
> > > a 128k chunk array versus a 4k chunk array (non-degraded).
> >
> > When doing sequential writes, a small chunk size means you are more
> > likely to fill up a whole stripe before data is flushed to disk, so it
> > is very possible that you won't need to pre-read parity at all.  With a
> > larger chunk size, it is more likely that you will have to write, and
> > possibly read, the parity block several times.
>
> Except if one worked on 4k sub-chunks - right ? :)

I still don't understand....  We *do* work with 4k sub-chunks.

> >
> > So if you are doing single threaded sequential accesses, a smaller
> > chunk size is definitely better.
>
> Definitely not so for reads - the seeking past the parity blocks ruins
> sequential read performance when we do many such seeks (e.g. when we
> have small chunks) - as witnessed by the benchmark data above.

Parity blocks aren't big enough to have to seek past.  I would imagine
that a modern drive would read a whole track into cache on the first
read request, and then find the required data, just past the parity
block, in the cache on the second request.  But maybe I'm wrong.

Or there could be some factor in the device driver where lots of little
read requests, even though they are almost consecutive, are handled more
poorly than a few large read requests.

I wonder if it would be worth reading those parity blocks anyway if a
sequential read were detected....

> > If you are doing lots of parallel accesses (typical multi-user work
> > load), small chunk sizes tend to mean that every access goes to all
> > drives, so there is lots of contention.  In theory a larger chunk size
> > means that more accesses will be entirely satisfied from just one disk,
> > so there is more opportunity for concurrency between the different
> > users.
> >
> > As always, the best way to choose a chunk size is to develop a
> > realistic work load and test it against several different chunk sizes.
> > There is no rule like "bigger is better" or "smaller is better".
>
> For a single reader/writer, it was pretty obvious from the above that
> "big is good" for reads (because of the fewer parity block skip seeks),
> and "small is good" for writes.
>
> So, making a big chunk-sized array and having it work on 4k sub-chunks
> for writes was an idea I had which I felt would give the best scenario
> in both cases.

The issue isn't so much the IO size as the layout on disk.  You cannot
use one layout for reads and a different layout for writes.  That
obviously doesn't make sense.

NeilBrown
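
To make the write-side argument concrete, here is a toy model (nothing to
do with md's actual stripe-cache code) of a sequential write stream hitting
a 4-disk RAID5 array.  The 64k flush granularity, the 10MB stream, and the
rule "a stripe that is only partially written when it is flushed costs one
parity pre-read" are all invented for illustration; the point is the shape
of the result, not the numbers.

#!/usr/bin/env python
# Toy model of parity read-modify-writes for RAID5 sequential writes.
# This is NOT md's write-out logic; it only illustrates why a stripe that
# is completely filled before write-out needs no parity pre-read, while a
# partially filled stripe does.

def count_parity_work(chunk_kb, flush_kb, total_kb, n_disks=4):
    """Walk a sequential write stream that is flushed every `flush_kb`
    of data and classify each touched stripe as either fully written
    (parity computed from the new data alone) or partially written
    (parity must be pre-read and rewritten)."""
    data_disks = n_disks - 1
    stripe_kb = chunk_kb * data_disks        # data capacity of one stripe
    full = rmw = 0
    pos = 0
    while pos < total_kb:
        start, end = pos, min(pos + flush_kb, total_kb)
        first = start // stripe_kb
        last = (end - 1) // stripe_kb
        for s in range(first, last + 1):
            s_start, s_end = s * stripe_kb, (s + 1) * stripe_kb
            if start <= s_start and end >= s_end:
                full += 1                    # whole stripe written in one go
            else:
                rmw += 1                     # partial write: read-modify-write
        pos = end
    return stripe_kb, full, rmw

if __name__ == "__main__":
    for chunk_kb in (4, 128):
        stripe_kb, full, rmw = count_parity_work(chunk_kb, flush_kb=64,
                                                 total_kb=10 * 1024)
        print("chunk %4dk  stripe %4dk  full-stripe writes %5d  "
              "parity read-modify-writes %5d"
              % (chunk_kb, stripe_kb, full, rmw))

With these assumptions the 4k chunk (12k stripe) array completes almost
every stripe before it is pushed out, while the 128k chunk (384k stripe)
array never does and updates each stripe's parity several times - the
behaviour described above, though the benchmark figures of course measure
much more than this model captures.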
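
On the read side, the disagreement is about how costly it is to hop over
the parity chunks.  The sketch below walks a left-symmetric parity
rotation (md's default layout behaves like this, but take the details as
illustrative; the 4-disk array and the 100MB read are arbitrary) and
counts how often a long sequential read has to skip a parity chunk on
some member disk: with 4k chunks the hops are tiny but very frequent,
with 128k chunks they are about 32 times rarer but chunk-sized.  Whether
the frequent small hops cost real seek time, or disappear into the
drive's track cache, is exactly the open question above.

#!/usr/bin/env python
# Toy layout walk for a RAID5 array with left-symmetric parity rotation.
# For a long sequential read it counts the parity-chunk gaps that the
# member disks have to hop over.

def sequential_read_gaps(chunk_kb, read_mb, n_disks=4):
    data_disks = n_disks - 1
    n_chunks = (read_mb * 1024) // chunk_kb      # data chunks to fetch
    gaps = 0
    prev_slot = [None] * n_disks                 # last stripe-slot read per disk
    for c in range(n_chunks):
        stripe, within = divmod(c, data_disks)
        pd = (n_disks - 1) - (stripe % n_disks)  # parity rotates backwards
        disk = (pd + 1 + within) % n_disks       # data starts just after parity
        if prev_slot[disk] is not None and stripe - prev_slot[disk] > 1:
            gaps += 1                            # skipped this disk's parity slot
        prev_slot[disk] = stripe
    return gaps

if __name__ == "__main__":
    for chunk_kb in (4, 128):
        gaps = sequential_read_gaps(chunk_kb, read_mb=100)
        print("chunk %4dk: %5d parity-chunk gaps (each %3dk) while reading 100MB"
              % (chunk_kb, gaps, chunk_kb))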