On 1/16/07, Jeff Garzik <jeff@xxxxxxxxxx> wrote:
ISTR either Jens or Andrew ran some numbers, and found that there was little utility beyond 4 or 8 tags or so.
The write cache is effectively queueing small writes already, so NCQ mainly brings random read performance closer to write performance. On the Maxtor drives with ~16MB of cache, they could do almost 200 write ops/s at 7200 RPM with their buffer granularity. Random reads were about 70 ops/s at a queue depth of 1 and 120 ops/s at a depth of 32. Every doubling of queue depth added another step of performance, bringing reads closer to cached writes (queued or unqueued). (Infinite queue depth essentially eliminates seek and rotational time, leaving the minimum settle time as the minimum operation time.)

It is very application dependent, but for mixed random workloads a 25-30% performance increase was common in our testing. Drives should handle normal streaming workloads at identical performance with or without queueing, since those patterns are easy to detect. Done properly, queueing should never hurt performance. High queue depths will increase average latency, of course, but shouldn't hurt overall throughput.

--eric
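As a rough back-of-the-envelope check on those figures, interpolating only between the two quoted points and assuming the gain per doubling of queue depth is roughly constant as described:

/* Interpolate the quoted figures: ~70 random read ops/s at depth 1,
 * ~120 ops/s at depth 32, assuming a constant gain per doubling.
 * Build with: cc -std=c99 iops.c -lm
 */
#include <stdio.h>
#include <math.h>

int main(void)
{
	const double d1 = 70.0, d32 = 120.0;		/* quoted figures         */
	const double step = (d32 - d1) / 5.0;		/* 1 -> 32 is 5 doublings */

	for (int depth = 1; depth <= 32; depth *= 2)
		printf("depth %2d: ~%3.0f random read ops/s\n",
		       depth, d1 + step * log2((double)depth));
	return 0;
}

That works out to roughly 10 ops/s per doubling between the two measured points; the real curve depends on the drive and the access pattern.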
NCQ mainly helps with multiple threads doing reads. Writes are largely asynchronous to the user already (except for fsync-style writes). You want to be able to stuff the disk's internal elevator with as many read requests as possible, because reads are very often synchronous -- most apps (1) read a block, (2) do something, (3) goto step #1. The kernel's elevator isn't much use in these cases.
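A minimal sketch of that difference, assuming a hypothetical /dev/sdX and using libaio for the queued case; the block size and depth are arbitrary illustration values, not anything measured in this thread:

/* Sketch only: contrast the synchronous read pattern described above,
 * where the drive never sees more than one command at a time, with a
 * batch of reads queued through libaio.  Build with -laio.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BLK   4096
#define DEPTH 32

int main(void)
{
	int fd = open("/dev/sdX", O_RDONLY | O_DIRECT);	/* hypothetical device */
	if (fd < 0) { perror("open"); return 1; }

	/* 1. Read a block, do something, repeat: only one request is ever
	 *    outstanding, so neither the kernel elevator nor NCQ has
	 *    anything to reorder.
	 */
	void *buf;
	if (posix_memalign(&buf, BLK, BLK)) return 1;
	for (int i = 0; i < DEPTH; i++)
		if (pread(fd, buf, BLK, (off_t)(random() % 1000000) * BLK) < 0)
			perror("pread");

	/* 2. Hand the drive DEPTH commands at once so its internal
	 *    scheduler can order them by positioning time.
	 */
	io_context_t ctx = 0;
	if (io_setup(DEPTH, &ctx) < 0) { fprintf(stderr, "io_setup failed\n"); return 1; }

	struct iocb cbs[DEPTH], *cbp[DEPTH];
	for (int i = 0; i < DEPTH; i++) {
		void *b;
		if (posix_memalign(&b, BLK, BLK)) return 1;
		io_prep_pread(&cbs[i], fd, b, BLK,
			      (long long)(random() % 1000000) * BLK);
		cbp[i] = &cbs[i];
	}
	if (io_submit(ctx, DEPTH, cbp) < 0) { fprintf(stderr, "io_submit failed\n"); return 1; }

	struct io_event events[DEPTH];
	io_getevents(ctx, DEPTH, DEPTH, events, NULL);

	io_destroy(ctx);
	close(fd);
	return 0;
}

Multiple independent threads each doing the synchronous loop have the same effect as the queued batch: the device ends up with several commands in flight to choose from.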
True. And internal to the drive, a normal elevator is "meh." There are other algorithms for scheduling that perform better.
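As a toy illustration only (invented numbers, not any particular firmware): a plain elevator orders commands by LBA alone, while a shortest-positioning-time-first style policy also accounts for where the platter will be when the seek completes:

/* Toy model: time unit is one revolution, cylinder and angle both
 * normalized to 0..1.  All values are made up for illustration.
 */
#include <stdio.h>
#include <math.h>

struct req { double cyl; double angle; };

static double positioning_cost(double head_cyl, double head_angle, struct req r)
{
	double seek   = fabs(r.cyl - head_cyl);			/* crude seek time  */
	double arrive = head_angle + seek;			/* platter position */
	double rot    = fmod(r.angle - arrive + 2.0, 1.0);	/* wait for sector  */
	return seek + rot;
}

int main(void)
{
	/* head at cylinder 0.0, angle 0.0; three pending commands */
	struct req q[] = { { 0.10, 0.90 }, { 0.12, 0.20 }, { 0.80, 0.50 } };

	/* An elevator sweeping upward services q[0] first (nearest LBA),
	 * but q[1] is far cheaper once rotation is taken into account.
	 */
	for (int i = 0; i < 3; i++)
		printf("req %d: cost %.2f rev\n", i,
		       positioning_cost(0.0, 0.0, q[i]));
	return 0;
}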