Re: [PATCH, RFC]: move full command queue handling into fabric drivers

"Nicholas A. Bellinger" <nab@xxxxxxxxxxxxxxx> · Tue, 29 Nov 2011 23:27:13 -0800

On Tue, 2011-11-29 at 03:13 -0500, Christoph Hellwig wrote:
> Currently there is a lot of code in the target core to deal with the fact
> that the qla2xxx might return errors from ->queue_data_in and ->queue_status
> because its internal command queues.  The code adds commands that failed in
> this way onto a per-device list, then immediately schedules a work struct
> to add them to the per-device command queue, which then retries the action.
> In short it's a very complex busy loop.  This patch moves the busy looping
> into qla2xxx to get rid of all this code.  While this still isn't optimal
> it gets rid of a lot of useless core complexity, and allows to add a
> waitqueue to the qla2xxx host queue to actually allow targeted wakeups.
> 
> I brifely looked into implementing these proper wakeups, but I failed to
> understand the related code in qla2xxx - someone more familar with the
> hardware really should look into waking up callers once we actually have
> free command slots.
> 
> Signed-off-by: Christoph Hellwig <hch@xxxxxx>
> 
> ---
>  drivers/scsi/qla2xxx/qla_target.c               |   19 ++
>  drivers/target/target_core_transport.c          |  185 +-----------------------
>  drivers/target/tcm_qla2xxx/tcm_qla2xxx_fabric.c |    3 
>  include/target/target_core_base.h               |    7 
>  4 files changed, 32 insertions(+), 182 deletions(-)
> 

So I'm still undecided if pushing this completely out into fabric
drivers in order to drop the current complexity of queue full with
target-core is a better tradeoff for not..

This type of core internal queue-full handling is used by ib_srpt and
other fabrics potentially using ACK_KREF, so I think we want to at least
still handle generically for fabrics (if necessary) that need it atm.

> Index: lio-core/drivers/scsi/qla2xxx/qla_target.c
> ===================================================================
> --- lio-core.orig/drivers/scsi/qla2xxx/qla_target.c	2011-11-29 08:57:01.607469832 +0100
> +++ lio-core/drivers/scsi/qla2xxx/qla_target.c	2011-11-29 08:59:27.223347631 +0100
> @@ -2317,15 +2317,28 @@ static inline void qla_tgt_check_srr_deb
>  int qla_tgt_xmit_response(struct qla_tgt_cmd *cmd, int xmit_type, uint8_t scsi_status)
>  {
>  	qla_tgt_check_srr_debug(cmd, &xmit_type);
> +	int error;
>  
> +again:
>  	ql_dbg(ql_dbg_tgt, cmd->vha, 0xe017, "is_send_status=%d,"
>  		" cmd->bufflen=%d, cmd->sg_cnt=%d, cmd->dma_data_direction=%d",
>  		(xmit_type & QLA_TGT_XMIT_STATUS) ? 1 : 0, cmd->bufflen,
>  		cmd->sg_cnt, cmd->dma_data_direction);
>  
> -	return (IS_FWI2_CAPABLE(cmd->tgt->ha)) ?
> -		__qla_tgt_24xx_xmit_response(cmd, xmit_type, scsi_status) :
> -		__qla_tgt_2xxx_xmit_response(cmd, xmit_type, scsi_status);
> +	if (IS_FWI2_CAPABLE(cmd->tgt->ha))
> +		error = __qla_tgt_24xx_xmit_response(cmd, xmit_type, scsi_status);
> +	else
> +		error = __qla_tgt_2xxx_xmit_response(cmd, xmit_type, scsi_status);
> +
> +	if (error == -EAGAIN || error == -ENOMEM) {
> +		/*
> +		 * XXX: we should use a waitqueue for the actual resource here.
> +		 */
> +		msleep(1);
> +		goto again;
> +	}
> +
> +	return error;
>  }
>  EXPORT_SYMBOL(qla_tgt_xmit_response);
>  

Note outgoing TFO->write_pending() -> qla_tgt_rdy_to_xfer() failures
will also need to be addressed for this to work as expected for FCP
WRITE.

--nab

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html