On Mar 7, 2006, at 12:56 PM, Vladislav Bolkhovitin wrote:
Bryan Henderson wrote:
On Mar 2, 2006, at 11:21 AM, Vladislav Bolkhovitin wrote:
Could anyone advise how a SCSI target device can IO-throttle its
initiators, i.e. prevent them from queuing too many commands,
please?
I suppose the best way to do this is to inform the initiators of
the maximum queue depth X of the target device, so that no
initiator sends more than X commands. But I have not found
anything like that in the INQUIRY data or the MODE SENSE pages.
Have I missed something? Just returning QUEUE FULL status doesn't
look correct, because it can lead to out-of-order command
execution.
Returning QUEUE FULL status is correct, unless the initiator does
not have any pending commands on the LUN, in which case you
should return BUSY. Yes, this can lead to out-of-order execution.
That's why tapes have traditionally not used SCSI command queuing.
I'm confused. Vladislav appears to be asking about flow control
such as is built into iSCSI, wherein the iSCSI target tells the
initiator how many tasks it's willing to work on at once, and the
initiator stops sending new ones when it has hit that limit and
waits for one of the previous ones to finish. And the target can
continuously change that number.
Yes, exactly.
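
For readers following the thread, the window-style flow control
described above can be sketched roughly as follows. This is a
simplified illustration, not the iSCSI wire protocol (which
expresses the window through command sequence numbers); the class
and method names are hypothetical.

```python
class WindowedInitiator:
    """Initiator that keeps at most `window` commands outstanding.

    Hypothetical sketch: the names here are illustrative and do not
    correspond to any real driver or to the iSCSI PDU fields.
    """

    def __init__(self, window):
        self.window = window      # limit most recently advertised by the target
        self.outstanding = set()  # tags of commands sent but not yet completed
        self.pending = []         # commands waiting for a free slot (FIFO)
        self.sent = []            # record of (tag, command), stands in for the wire
        self._next_tag = 0

    def submit(self, command):
        """Queue a command locally and send it if the window allows."""
        self.pending.append(command)
        self._pump()

    def complete(self, tag, new_window=None):
        """A command finished; the target may advertise a new limit."""
        self.outstanding.discard(tag)
        if new_window is not None:
            self.window = new_window
        self._pump()

    def _pump(self):
        # Send queued commands only while under the advertised limit.
        while self.pending and len(self.outstanding) < self.window:
            command = self.pending.pop(0)
            tag = self._next_tag
            self._next_tag += 1
            self.outstanding.add(tag)
            self.sent.append((tag, command))
```

Because commands leave the initiator in FIFO order and are never
rejected by the target, this scheme avoids the reordering that
QUEUE FULL retries can cause.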
With the more primitive transports, I believe this is a manual
configuration step -- the target has a fixed maximum queue depth
and you tell the driver via some configuration parameter what it is.
We currently deal mostly with Fibre Channel, which seems to be one
of those "more primitive transports" without explicit flow control.
Actually, I'm very surprised and can hardly believe that such an
advanced and expensive technology lacks such a basic thing as good
flow control. Strictly speaking, though, such flow control sits at
a level above the transport (this is true for iSCSI as well), so
this is a SCSI flaw, not an FC one.
It has X-ON and X-OFF flow control. Not bad considering it was
designed in the early 1980s.
X-OFF is TASK_SET_FULL or BUSY.
X-ON is a command completing or, if BUSY was received because the
initiator did not have any outstanding commands at the target,
X-ON is implied after a short time delay.
Since an intelligently-designed initiator isn't going to dump every
command to the device anyway (after all, the person writing the
initiator driver wants to have some fun implementing I/O
optimizations too; can't let those target folk have all the fun :-),
the XON/XOFF flow control isn't often invoked.
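
The X-ON/X-OFF rules above can be sketched as a small piece of
initiator-side state. This is only an illustration under stated
assumptions: the class name, the status strings, and the 0.5 second
retry delay are hypothetical, and requeuing the rejected command is
left to the caller.

```python
GOOD, BUSY, TASK_SET_FULL = "GOOD", "BUSY", "TASK SET FULL"


class XonXoffQueue:
    """Tracks whether the initiator may send to one LUN (hypothetical sketch)."""

    def __init__(self, retry_delay=0.5):
        self.outstanding = 0            # commands sent, status not yet received
        self.stopped = False            # True after an X-OFF
        self.resume_at = None           # time of the implied X-ON, if any
        self.retry_delay = retry_delay  # assumed value for the "short time delay"

    def may_send(self, now):
        # Implied X-ON: the delay after a BUSY with nothing outstanding expired.
        if self.stopped and self.resume_at is not None and now >= self.resume_at:
            self.stopped = False
            self.resume_at = None
        return not self.stopped

    def on_send(self):
        self.outstanding += 1

    def on_status(self, status, now):
        self.outstanding -= 1
        if status in (BUSY, TASK_SET_FULL):
            # X-OFF: stop sending. If nothing else is outstanding, no
            # completion can arrive to act as X-ON, so fall back to a timer.
            self.stopped = True
            self.resume_at = now + self.retry_delay if self.outstanding == 0 else None
        else:
            # X-ON: a command completed, so the target has a free slot.
            self.stopped = False
            self.resume_at = None
```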
As I understand it, any system in which QUEUE FULL (that's another
name for SCSI's Task Set Full, isn't it?) errors happen is one
that is not properly configured. I saw a broken iSCSI system that
had QUEUE FULLs happening, and it was a performance disaster.
That is what we observe too: too many QUEUE FULLs degrade
performance considerably.
Sounds like a broken initiator.
Apparently, hardware SCSI targets don't suffer from queue
overflow and don't return QUEUE FULL status all the time, so there
must be a way to do the throttling more elegantly.
No, they just have big queues.
Big queues are another serious performance problem, when it means
a target accepts work faster than it can do it. I've seen that
cause initiators to send suboptimal requests (if the target
appears to be working at infinite speed, the initiator sends small
chunks of work as soon as each is ready, whereas if the initiator
can tell that the target is choked, the initiator combines and
sorts work while it waits, into a stream the target can handle
more efficiently). When systems substitute an oversized queue in
a target for initiator-target flow control, the initiator ends up
having to compensate with artificial schemes to withhold work from
a willing target (e.g. Linux "queue plugging").
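
As a rough illustration of the "combines and sorts work while it
waits" behaviour described above, an initiator holding back
requests might merge them like this. The (start block, block count)
request representation and the function name are assumptions made
for the example; real I/O schedulers are considerably more
involved, and overlapping requests are not handled here.

```python
def coalesce(requests):
    """Sort pending requests by start block and merge contiguous ones.

    `requests` is a list of (start_block, block_count) tuples
    (a hypothetical representation chosen for this sketch).
    """
    merged = []
    for start, length in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == start:
            # This request begins exactly where the previous one ends.
            prev_start, prev_length = merged[-1]
            merged[-1] = (prev_start, prev_length + length)
        else:
            merged.append((start, length))
    return merged
```

For instance, four small requests at blocks 16, 0, 8 and 100 would
go to the target as one 24-block request plus one 4-block request.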
This is one reason why I don't like having an oversized queue on
the target.
This is just a matter of taste of whether you prefer the optimization
to be done on the initiator side or the target side. If you prefer it
to be done on the initiator side, then don't queue large amounts of
work at the target.
Another one is initiator-side timeouts, when the queue is so big
that the queued commands cannot be completed in time. I described
it in the previous email.
This is just a bug in the initiator. It can observe the average
service time and it knows how many commands it has queued. If it sets
its timeout anywhere close to the product of those two numbers it is
buggy.
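
A minimal sketch of that timeout reasoning, assuming the initiator
tracks an average service time in seconds. The safety factor of 4
and the 1 second floor are illustrative assumptions, not values
from any real driver.

```python
def command_timeout(avg_service_time, queued_commands,
                    safety_factor=4.0, floor=1.0):
    """Pick a timeout well above the expected time to drain the queue.

    The expected wait is the product of the observed average service
    time and the number of commands queued at the target; the timeout
    is kept a comfortable multiple above it (assumed factor).
    """
    expected_wait = avg_service_time * queued_commands
    return max(floor, safety_factor * expected_wait)
```

With a 10 ms average service time and 64 queued commands, the
expected wait is 0.64 s, so this sketch would choose 2.56 s rather
than something close to 0.64 s.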
Regards,
-Steve
--
Steve Byan <smb@xxxxxxxxxxx>
Software Architect
Egenera, Inc.
165 Forest Street
Marlboro, MA 01752
(508) 858-3125