It's bad for a controller to suck in an arbitrarily large number of requests and queue them. It makes the device look less busy to the initiator than it really is, which causes the initiator to conclude, wrongly, that coalescing would be a net loss. The amount of I/O in flight from the initiator should be just enough to cover the reaction time between when the controller signals that it's (really) ready for more work and when the initiator is able to deliver more, thus preventing the controller from running dry.

But I've run into this many times myself -- the controller looks like it can do I/O at channel speed for a while, and then it looks very slow for a while because of the tiny, poorly ordered chunks of work it was given. This causes people to propose various queue plugging algorithms where the initiator withholds work from a willing controller because it knows more about that controller's performance characteristics than the controller lets on. I don't like any of these algorithms because they are so highly dependent on what goes on inside the controller and on the work arrival pattern. There's no heuristic you can choose that works for even all the common scenarios, let alone the scenarios we haven't thought of yet. I generally override Linux block layer queue plugging (by explicitly unplugging every time I put something in the queue).

If you have a continual flow of work, you can solve the problem just by enlarging upstream queues to make sure you swamp whatever queue capacity the controller has. I've done that by increasing the number of threads processing files, for example. If you have a bursty, response-time-sensitive workload (e.g. a small number of threads doing small synchronous reads, well below the capacity of the controller), it's much harder.

Of course, the cleanest thing to do is to reduce the size of the queue in the controller to the reaction-time window I described above, or have the controller adjust it dynamically. But when you don't have that luxury, doing the same thing via the Linux driver queue depth (this patch) seems like a great substitute to me. Why is it just for sequential? If your pipeline (device driver -> controller -> device) is always full, what's to lose by backing up requests that can't be coalesced?

By the way, a disk controller is in theory a better place to do coalescing than a Linux queue -- the controller should know more about the ideal ordering and clustering of reads and writes. Linux should have to coalesce only where small I/Os saturate the pipe to the controller. But I assume we're talking about controllers that execute each received I/O separately.

--
Bryan Henderson                          IBM Almaden Research Center
San Jose CA                              Filesystems
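
[Editorial note, not part of the original post: as a rough illustration of the "do the same thing via the Linux driver queue depth" idea above, here is a minimal userspace sketch. It assumes a kernel that exposes the usual sysfs attributes /sys/block/<dev>/device/queue_depth (per-device driver queue depth, writable only where the low-level driver supports changing it) and /sys/block/<dev>/queue/nr_requests (block-layer request queue size). The device name "sda" and the values 8 and 512 are purely illustrative.]

/*
 * Sketch: keep the queue handed to the device/controller shallow,
 * while leaving a deep queue upstream in the block layer so that
 * requests back up where they can still be merged and sorted.
 */
#include <stdio.h>
#include <stdlib.h>

static int write_sysfs(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		return -1;
	}
	if (fprintf(f, "%s\n", value) < 0) {
		perror(path);
		fclose(f);
		return -1;
	}
	return fclose(f);
}

int main(void)
{
	/* Cap how many commands the driver keeps outstanding on the device. */
	if (write_sysfs("/sys/block/sda/device/queue_depth", "8"))
		return EXIT_FAILURE;

	/* Enlarge the upstream (block-layer) queue so work accumulates
	 * on the initiator side, where it can still be coalesced. */
	if (write_sysfs("/sys/block/sda/queue/nr_requests", "512"))
		return EXIT_FAILURE;

	return EXIT_SUCCESS;
}

The same effect can be had from a shell by writing those two sysfs files directly; the point is only to show the two knobs acting together, not to suggest specific values.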