Re: recommendations for stripe/chunk size

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Keld Jørn Simonsen wrote:
Hi

I am looking at revising our howto. I see a number of places where a
chunk size of 32 kiB is recommended, and even recommendations on
maybe using sizes of 4 kiB.
Depending on the raid level, a write smaller than the chunk size causes the chunk to be read, altered, and rewritten, vs. just written if the write is a multiple of chunk size. Many filesystems by default use a 4k page size and writes. I believe this is the reasoning behind the suggestion of small chunk sizes. Sequential vs. random and raid level are important here, there's no one size to work best in all cases.
My own take on that is that this really hurts performance. Normal disks have a rotation speed of between 5400 (laptop)
7200 (ide/sata) and 10000 (SCSI) rounds per minute, giving an average
spinning time for one round of 6 to 12 ms, and average latency of half
this, that is 3 to 6 ms. Then you need to add head movement which
is something like 2 to 20 ms - in total average seek time 5 to 26 ms,
averaging around 13-17 ms.
Having a write not some multiple of chunk size would seem to require a read-alter- wait_for_disk_rotation-write, and for large sustained sequential i/o using multiple drives helps transfer. for small random i/o small chunks are good, I find little benefit to chunks over 256 or maybe 1024k.
in about 15 ms you can read on current SATA-II (300 MB/s) or ATA/133 something like between 600 to 1200 kB, actual transfer rates of
80 MB/s on SATA-II and 40 MB/s on ATA/133. So to get some bang for the buck,
and transfer some data you should have something like 256/512 kiB
chunks. With a transfer rate of 50 MB/s and chunk sizes of 256 kiB
giving about a time of 20 ms per transaction
you should be able with random reads to transfer 12 MB/s  - my
actual figures is about 30 MB/s which is possibly because of the
elevator effect of the file system driver. With a size of 4 kb per chunk you should have a time of 15 ms per transaction, or 66 transactions per second, or a transfer rate of 250 kb/s. So 256 kb vs 4 kb speeds up the transfer by a factor of 50.
If you actually see anything like this your write caching and readahead aren't doing what they should!

I actually  think the kernel should operate with block sizes
like this and not wth 4 kiB blocks. It is the readahead and the elevator
algorithms that save us from randomly reading 4 kb a time.

Exactly, and nothing save a R-A-RW cycle if the write is a partial chunk.
I also see that there are some memory constrints on this.
Having maybe 1000 processes reading, as for my mirror service,
256 kib buffers would be acceptable, occupying 256 MB RAM.
That is reasonable, and I could even tolerate 512 MB ram used.
But going to 1 MiB buffers would be overdoing it for my configuration.

What would be the recommended chunk size for todays equipment?

I think usage is more important than hardware. My opinion only.

Best regards
Keld


--
Bill Davidsen <davidsen@xxxxxxx>
 "Woe unto the statesman who makes war without a reason that will still
be valid when the war is over..." Otto von Bismark


-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux