Cc to linux-scsi added because that's the list that best handles this type of question.

On Thu, 2011-02-24 at 20:13 -0800, va stg2010 wrote:
> Hi,
> I am working on a SCSI initiator HBA driver for Linux and have an
> implementation question. Once the commands are received in the
> .queuecommand callback from linux-scsi, I insert them into a queue
> maintained locally in my driver until the responses come back from
> the target. The responses, when posted later by an interrupt handler,
> are eventually processed by a kthread which "iterates" through this
> queue to post responses back to linux-scsi.

Actually, doing internal queueing isn't a good idea: two queues confuse
the block elevators and usually only serve to increase latency. If by
"queue" you just mean a list of pending commands that have already been
issued to the driver, which you need to find again by some identifier
when the interrupt-driven completion is posted, then using the block
tags for this is usually optimal (depending on how many bits you have
for the completion identifier).

> Question about this queue:
> Is it more efficient to have one single queue for all the disks, or
> to have a separate queue per disk with separate response processing
> kthreads?

Having a kthread process responses is generally not a good idea because
completions come in at interrupt level ... you need a context switch to
get to a thread, and that costs latency. The idea of done processing in
SCSI is to identify the scsi_cmnd as quickly as possible and post it.
All back-end SCSI processing is done in the block softirq (a level
between hard interrupt and user context), again to keep latency low.
That also means the kthread architecture is wrong because it's
difficult for the kernel to go hardirq->user->softirq without adding an
extra interrupt latency (usually a clock tick).

If you want a "threaded" response on a multiqueue card using MSIs, then
you bind the MSIs to CPU groups and use the hardware interrupt context
as the threading (I think drivers like lpfc already do this). The best
performance is actually observed when the MSI comes back in on the same
CPU that issued the I/O, because the cache is still hot. The block
layer keeps rq->cpu to track this, which the internal HBA setup can use
for programming MSI completions.

James
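
P.S. In case it helps, a rough sketch of the tag-based completion path
described above (the my_hba_* calls are placeholders for your hardware
interface, and it assumes tagged queueing has been enabled with
scsi_activate_tcq() and the lock-less queuecommand of recent kernels):

#include <linux/interrupt.h>
#include <scsi/scsi_cmnd.h>
#include <scsi/scsi_host.h>
#include <scsi/scsi_tcq.h>

/* queuecommand: hand the block tag to the hardware as the completion
 * cookie instead of tracking the command on a private list. */
static int my_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *cmd)
{
	int tag = cmd->request->tag;		/* valid once TCQ is active */

	my_hba_post_command(shost, cmd, tag);	/* placeholder HW call */
	return 0;
}

/* Completion path: stays in hard-irq context; scsi_done() defers the
 * rest of the work to the block softirq, so no kthread is needed. */
static irqreturn_t my_isr(int irq, void *dev_id)
{
	struct Scsi_Host *shost = dev_id;
	u32 cookie;

	while (my_hba_next_completion(shost, &cookie)) {  /* placeholder */
		struct scsi_cmnd *cmd = scsi_host_find_tag(shost, cookie);

		if (cmd) {
			cmd->result = DID_OK << 16;
			cmd->scsi_done(cmd);
		}
	}
	return IRQ_HANDLED;
}

The point is that scsi_host_find_tag() does the command lookup for you
from the block layer's own tag map, so there's no driver-private queue
to search and no thread to wake.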