It's bad for a controller to suck in an arbitrarily large number of requests and queue them. It makes the device look less busy to the initiator than it really is, which causes the initiator to conclude, wrongly, that coalescing would be a net loss. The amount of I/O in flight from the initiator should be just enough to cover the reaction time between when the controller signals that it's (really) ready for more work and when the initiator is able to deliver more, thus preventing the controller from running dry.

But I've run into this many times myself -- the controller looks like it can do I/O at channel speed for a while, and then it looks very slow for a while because of the tiny, poorly ordered chunks of work it was given. This causes people to propose various queue plugging algorithms where the initiator withholds work from a willing controller because it knows more about that controller's performance characteristics than the controller lets on. I don't like any of these algorithms because they are so highly dependent on what goes on inside the controller and on the work arrival pattern. There's no heuristic you can choose that works for even all the common scenarios, let alone the scenarios we haven't thought of yet. I generally override Linux block layer queue plugging (by explicitly unplugging every time I put something in the queue).

If you have a continual flow of work, you can solve the problem just by enlarging upstream queues to make sure you swamp whatever queue capacity the controller has. I've done that by increasing the number of threads processing files, for example. If you have a bursty, response-time-sensitive workload (e.g. a small number of threads doing small synchronous reads, well below the capacity of the controller), it's much harder.

Of course, the cleanest thing to do is to reduce the size of the queue in the controller to the reaction-time window I described above, or have the controller adjust it dynamically. But when you don't have that luxury, doing the same thing via the Linux driver queue depth (this patch) seems like a great substitute to me. Why is it just for sequential? If your pipeline (device driver -> controller -> device) is always full, what's to lose by backing up requests that can't be coalesced?

By the way, a disk controller is in theory a better place to do coalescing than a Linux queue -- the controller should know more about the ideal ordering and clustering of reads and writes. Linux should have to coalesce only where small I/Os saturate the pipe to the controller. But I assume we're talking about controllers that execute each received I/O separately.

--
Bryan Henderson                          IBM Almaden Research Center
San Jose CA                              Filesystems
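
[Editorial note, not part of the original post: as a rough illustration of the "do the same thing via the Linux driver queue depth" idea above, here is a minimal userspace sketch. It assumes a kernel that exposes the usual sysfs attributes /sys/block/<dev>/device/queue_depth (per-device driver queue depth, writable only where the low-level driver supports changing it) and /sys/block/<dev>/queue/nr_requests (block-layer request queue size). The device name "sda" and the values 8 and 512 are purely illustrative.]

/*
 * Sketch: keep the queue handed to the device/controller shallow,
 * while leaving a deep queue upstream in the block layer so that
 * requests back up where they can still be merged and sorted.
 */
#include <stdio.h>
#include <stdlib.h>

static int write_sysfs(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		return -1;
	}
	if (fprintf(f, "%s\n", value) < 0) {
		perror(path);
		fclose(f);
		return -1;
	}
	return fclose(f);
}

int main(void)
{
	/* Cap how many commands the driver keeps outstanding on the device. */
	if (write_sysfs("/sys/block/sda/device/queue_depth", "8"))
		return EXIT_FAILURE;

	/* Enlarge the upstream (block-layer) queue so work accumulates
	 * on the initiator side, where it can still be coalesced. */
	if (write_sysfs("/sys/block/sda/queue/nr_requests", "512"))
		return EXIT_FAILURE;

	return EXIT_SUCCESS;
}

The same effect can be had from a shell by writing those two sysfs files directly; the point is only to show the two knobs acting together, not to suggest specific values.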