On Sat, 2023-08-05 at 14:16 -0400, Laurence Oberman wrote: > On Sat, 2023-08-05 at 12:54 -0400, Laurence Oberman wrote: > > On Fri, 2023-08-04 at 17:35 -0400, Laurence Oberman wrote: > > > On Fri, 2023-08-04 at 14:35 -0400, Laurence Oberman wrote: > > > > On Fri, 2023-08-04 at 13:51 -0400, Laurence Oberman wrote: > > > > > On Fri, 2023-08-04 at 12:58 -0400, Laurence Oberman wrote: > > > > > > On Fri, 2023-08-04 at 12:49 +0530, Nilesh Javali wrote: > > > > > > > From: Quinn Tran <qutran@xxxxxxxxxxx> > > > > > > > > > > > > > > System crash when 32 bytes CDB was sent to a non T10-PI > > > > > > > disk, > > > > > > > > > > > > > > [ 177.143279] ? qla2xxx_dif_start_scsi_mq+0xcd8/0xce0 > > > > > > > [qla2xxx] > > > > > > > [ 177.149165] ? internal_add_timer+0x42/0x70 > > > > > > > [ 177.153372] qla2xxx_mqueuecommand+0x207/0x2b0 > > > > > > > [qla2xxx] > > > > > > > [ 177.158730] scsi_queue_rq+0x2b7/0xc00 > > > > > > > [ 177.162501] blk_mq_dispatch_rq_list+0x3ea/0x7e0 > > > > > > > > > > > > > > Current code attempt to use CRC IOCB to send the command > > > > > > > but > > > > > > > failed. > > > > > > > Instead, type 6 IOCB should be used to send the IO. > > > > > > > > > > > > > > Clone existing type 6 IOCB code with addition of MQ > > > > > > > support > > > > > > > to allow 32 bytes CDB to go through. > > > > > > > > > > > > > > Signed-off-by: Quinn Tran <qutran@xxxxxxxxxxx> > > > > > > > Cc: Laurence Oberman <loberman@xxxxxxxxxx> > > > > > > > Cc: stable@xxxxxxxxxxxxxxx > > > > > > > Signed-off-by: Nilesh Javali <njavali@xxxxxxxxxxx> > > > > > > > --- > > > > > > > drivers/scsi/qla2xxx/qla_iocb.c | 270 > > > > > > > ++++++++++++++++++++++++++++++++ > > > > > > > drivers/scsi/qla2xxx/qla_nx.h | 4 +- > > > > > > > 2 files changed, 273 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > diff --git a/drivers/scsi/qla2xxx/qla_iocb.c > > > > > > > b/drivers/scsi/qla2xxx/qla_iocb.c > > > > > > > index 0caa64a7df26..e99ebf7e1c7a 100644 > > > > > > > --- a/drivers/scsi/qla2xxx/qla_iocb.c > > > > > > > +++ b/drivers/scsi/qla2xxx/qla_iocb.c > > > > > > > @@ -11,6 +11,7 @@ > > > > > > > > > > > > > > #include <scsi/scsi_tcq.h> > > > > > > > > > > > > > > +static int qla_start_scsi_type6(srb_t *sp); > > > > > > > /** > > > > > > > * qla2x00_get_cmd_direction() - Determine control_flag > > > > > > > data > > > > > > > direction. > > > > > > > * @sp: SCSI command > > > > > > > @@ -1721,6 +1722,8 @@ qla24xx_dif_start_scsi(srb_t *sp) > > > > > > > if (scsi_get_prot_op(cmd) == SCSI_PROT_NORMAL) { > > > > > > > if (cmd->cmd_len <= 16) > > > > > > > return qla24xx_start_scsi(sp); > > > > > > > + else > > > > > > > + return qla_start_scsi_type6(sp); > > > > > > > } > > > > > > > > > > > > > > /* Setup device pointers. */ > > > > > > > @@ -2100,6 +2103,8 @@ qla2xxx_dif_start_scsi_mq(srb_t > > > > > > > *sp) > > > > > > > if (scsi_get_prot_op(cmd) == SCSI_PROT_NORMAL) { > > > > > > > if (cmd->cmd_len <= 16) > > > > > > > return qla2xxx_start_scsi_mq(sp); > > > > > > > + else > > > > > > > + return qla_start_scsi_type6(sp); > > > > > > > } > > > > > > > > > > > > > > spin_lock_irqsave(&qpair->qp_lock, flags); > > > > > > > @@ -4205,3 +4210,268 @@ qla2x00_start_bidir(srb_t *sp, > > > > > > > struct > > > > > > > scsi_qla_host *vha, uint32_t tot_dsds) > > > > > > > > > > > > > > return rval; > > > > > > > } > > > > > > > + > > > > > > > +/** > > > > > > > + * qla_start_scsi_type6() - Send a SCSI command to the > > > > > > > ISP > > > > > > > + * @sp: command to send to the ISP > > > > > > > + * > > > > > > > + * Returns non-zero if a failure occurred, else zero. > > > > > > > + */ > > > > > > > +static int > > > > > > > +qla_start_scsi_type6(srb_t *sp) > > > > > > > +{ > > > > > > > + int nseg; > > > > > > > + unsigned long flags; > > > > > > > + uint32_t *clr_ptr; > > > > > > > + uint32_t handle; > > > > > > > + struct cmd_type_6 *cmd_pkt; > > > > > > > + uint16_t cnt; > > > > > > > + uint16_t req_cnt; > > > > > > > + uint16_t tot_dsds; > > > > > > > + struct req_que *req = NULL; > > > > > > > + struct rsp_que *rsp; > > > > > > > + struct scsi_cmnd *cmd = GET_CMD_SP(sp); > > > > > > > + struct scsi_qla_host *vha = sp->fcport->vha; > > > > > > > + struct qla_hw_data *ha = vha->hw; > > > > > > > + struct qla_qpair *qpair = sp->qpair; > > > > > > > + uint16_t more_dsd_lists = 0; > > > > > > > + struct dsd_dma *dsd_ptr; > > > > > > > + uint16_t i; > > > > > > > + __be32 *fcp_dl; > > > > > > > + uint8_t additional_cdb_len; > > > > > > > + struct ct6_dsd *ctx; > > > > > > > + > > > > > > > + > > > > > > > + /* Acquire qpair specific lock */ > > > > > > > + spin_lock_irqsave(&qpair->qp_lock, flags); > > > > > > > + > > > > > > > + /* Setup qpair pointers */ > > > > > > > + req = qpair->req; > > > > > > > + rsp = qpair->rsp; > > > > > > > + > > > > > > > + /* So we know we haven't pci_map'ed anything yet > > > > > > > */ > > > > > > > + tot_dsds = 0; > > > > > > > + > > > > > > > + /* Send marker if required */ > > > > > > > + if (vha->marker_needed != 0) { > > > > > > > + if (__qla2x00_marker(vha, qpair, 0, 0, > > > > > > > MK_SYNC_ALL) > > > > > > > != QLA_SUCCESS) { > > > > > > > + spin_unlock_irqrestore(&qpair- > > > > > > > > qp_lock, > > > > > > > flags); > > > > > > > + return QLA_FUNCTION_FAILED; > > > > > > > + } > > > > > > > + vha->marker_needed = 0; > > > > > > > + } > > > > > > > + > > > > > > > + handle = qla2xxx_get_next_handle(req); > > > > > > > + if (handle == 0) > > > > > > > + goto queuing_error; > > > > > > > + > > > > > > > + /* Map the sg table so we have an accurate count > > > > > > > of > > > > > > > sg > > > > > > > entries needed */ > > > > > > > + if (scsi_sg_count(cmd)) { > > > > > > > + nseg = dma_map_sg(&ha->pdev->dev, > > > > > > > scsi_sglist(cmd), > > > > > > > + scsi_sg_count(cmd), > > > > > > > cmd- > > > > > > > > sc_data_direction); > > > > > > > + if (unlikely(!nseg)) > > > > > > > + goto queuing_error; > > > > > > > + } else { > > > > > > > + nseg = 0; > > > > > > > + } > > > > > > > + > > > > > > > + tot_dsds = nseg; > > > > > > > + > > > > > > > + /* eventhough driver only need 1 T6 IOCB, FW > > > > > > > still > > > > > > > convert > > > > > > > DSD to Continueation IOCB */ > > > > > > > + req_cnt = qla24xx_calc_iocbs(vha, tot_dsds); > > > > > > > + > > > > > > > + sp->iores.res_type = RESOURCE_IOCB | > > > > > > > RESOURCE_EXCH; > > > > > > > + sp->iores.exch_cnt = 1; > > > > > > > + sp->iores.iocb_cnt = req_cnt; > > > > > > > + > > > > > > > + if (qla_get_fw_resources(sp->qpair, &sp->iores)) > > > > > > > + goto queuing_error; > > > > > > > + > > > > > > > + more_dsd_lists = > > > > > > > qla24xx_calc_dsd_lists(tot_dsds); > > > > > > > + if ((more_dsd_lists + qpair->dsd_inuse) >= > > > > > > > NUM_DSD_CHAIN) > > > > > > > { > > > > > > > + ql_dbg(ql_dbg_io, vha, 0x3028, > > > > > > > + "Num of DSD list %d is than %d for > > > > > > > cmd=%p.\n", > > > > > > > + more_dsd_lists + qpair->dsd_inuse, > > > > > > > NUM_DSD_CHAIN, cmd); > > > > > > > + goto queuing_error; > > > > > > > + } > > > > > > > + > > > > > > > + if (more_dsd_lists <= qpair->dsd_avail) > > > > > > > + goto sufficient_dsds; > > > > > > > + else > > > > > > > + more_dsd_lists -= qpair->dsd_avail; > > > > > > > + > > > > > > > + for (i = 0; i < more_dsd_lists; i++) { > > > > > > > + dsd_ptr = kzalloc(sizeof(*dsd_ptr), > > > > > > > GFP_ATOMIC); > > > > > > > + if (!dsd_ptr) { > > > > > > > + ql_log(ql_log_fatal, vha, 0x3029, > > > > > > > + "Failed to allocate memory > > > > > > > for > > > > > > > dsd_dma > > > > > > > for cmd=%p.\n", cmd); > > > > > > > + goto queuing_error; > > > > > > > + } > > > > > > > + INIT_LIST_HEAD(&dsd_ptr->list); > > > > > > > + > > > > > > > + dsd_ptr->dsd_addr = dma_pool_alloc(ha- > > > > > > > > dl_dma_pool, > > > > > > > + GFP_ATOMIC, &dsd_ptr- > > > > > > > > dsd_list_dma); > > > > > > > + if (!dsd_ptr->dsd_addr) { > > > > > > > + kfree(dsd_ptr); > > > > > > > + ql_log(ql_log_fatal, vha, 0x302a, > > > > > > > + "Failed to allocate memory > > > > > > > for > > > > > > > dsd_addr > > > > > > > for cmd=%p.\n", cmd); > > > > > > > + goto queuing_error; > > > > > > > + } > > > > > > > + list_add_tail(&dsd_ptr->list, &qpair- > > > > > > > > dsd_list); > > > > > > > + qpair->dsd_avail++; > > > > > > > + } > > > > > > > + > > > > > > > +sufficient_dsds: > > > > > > > + req_cnt = 1; > > > > > > > + > > > > > > > + if (req->cnt < (req_cnt + 2)) { > > > > > > > + if (IS_SHADOW_REG_CAPABLE(ha)) { > > > > > > > + cnt = *req->out_ptr; > > > > > > > + } else { > > > > > > > + cnt = > > > > > > > (uint16_t)rd_reg_dword_relaxed(req- > > > > > > > > req_q_out); > > > > > > > + if > > > > > > > (qla2x00_check_reg16_for_disconnect(vha, > > > > > > > cnt)) > > > > > > > + goto queuing_error; > > > > > > > + } > > > > > > > + > > > > > > > + if (req->ring_index < cnt) > > > > > > > + req->cnt = cnt - req->ring_index; > > > > > > > + else > > > > > > > + req->cnt = req->length - (req- > > > > > > > > ring_index > > > > > > > - > > > > > > > cnt); > > > > > > > + if (req->cnt < (req_cnt + 2)) > > > > > > > + goto queuing_error; > > > > > > > + } > > > > > > > + > > > > > > > + ctx = &sp->u.scmd.ct6_ctx; > > > > > > > + > > > > > > > + memset(ctx, 0, sizeof(struct ct6_dsd)); > > > > > > > + ctx->fcp_cmnd = dma_pool_zalloc(ha- > > > > > > > > fcp_cmnd_dma_pool, > > > > > > > + GFP_ATOMIC, &ctx->fcp_cmnd_dma); > > > > > > > + if (!ctx->fcp_cmnd) { > > > > > > > + ql_log(ql_log_fatal, vha, 0x3031, > > > > > > > + "Failed to allocate fcp_cmnd for > > > > > > > cmd=%p.\n", > > > > > > > cmd); > > > > > > > + goto queuing_error; > > > > > > > + } > > > > > > > + > > > > > > > + /* Initialize the DSD list and dma handle */ > > > > > > > + INIT_LIST_HEAD(&ctx->dsd_list); > > > > > > > + ctx->dsd_use_cnt = 0; > > > > > > > + > > > > > > > + if (cmd->cmd_len > 16) { > > > > > > > + additional_cdb_len = cmd->cmd_len - 16; > > > > > > > + if (cmd->cmd_len % 4 || > > > > > > > + cmd->cmd_len > QLA_CDB_BUF_SIZE) { > > > > > > > + /* SCSI command bigger than 16 > > > > > > > bytes > > > > > > > must > > > > > > > be > > > > > > > + * multiple of 4 or too big. > > > > > > > + */ > > > > > > > + ql_log(ql_log_warn, vha, 0x3033, > > > > > > > + "scsi cmd len %d not multiple > > > > > > > of > > > > > > > 4 > > > > > > > for > > > > > > > cmd=%p.\n", > > > > > > > + cmd->cmd_len, cmd); > > > > > > > + goto queuing_error_fcp_cmnd; > > > > > > > + } > > > > > > > + ctx->fcp_cmnd_len = 12 + cmd->cmd_len + > > > > > > > 4; > > > > > > > + } else { > > > > > > > + additional_cdb_len = 0; > > > > > > > + ctx->fcp_cmnd_len = 12 + 16 + 4; > > > > > > > + } > > > > > > > + > > > > > > > + /* Build command packet. */ > > > > > > > + req->current_outstanding_cmd = handle; > > > > > > > + req->outstanding_cmds[handle] = sp; > > > > > > > + sp->handle = handle; > > > > > > > + cmd->host_scribble = (unsigned char *)(unsigned > > > > > > > long)handle; > > > > > > > + req->cnt -= req_cnt; > > > > > > > + > > > > > > > + cmd_pkt = (struct cmd_type_6 *)req->ring_ptr; > > > > > > > + cmd_pkt->handle = make_handle(req->id, handle); > > > > > > > + > > > > > > > + /* Zero out remaining portion of packet. */ > > > > > > > + /* tagged queuing modifier -- default is > > > > > > > TSK_SIMPLE > > > > > > > (0). > > > > > > > */ > > > > > > > + clr_ptr = (uint32_t *)cmd_pkt + 2; > > > > > > > + memset(clr_ptr, 0, REQUEST_ENTRY_SIZE - 8); > > > > > > > + cmd_pkt->dseg_count = cpu_to_le16(tot_dsds); > > > > > > > + > > > > > > > + /* Set NPORT-ID and LUN number*/ > > > > > > > + cmd_pkt->nport_handle = cpu_to_le16(sp->fcport- > > > > > > > > loop_id); > > > > > > > + cmd_pkt->port_id[0] = sp->fcport->d_id.b.al_pa; > > > > > > > + cmd_pkt->port_id[1] = sp->fcport->d_id.b.area; > > > > > > > + cmd_pkt->port_id[2] = sp->fcport->d_id.b.domain; > > > > > > > + cmd_pkt->vp_index = sp->vha->vp_idx; > > > > > > > + > > > > > > > + /* Build IOCB segments */ > > > > > > > + qla24xx_build_scsi_type_6_iocbs(sp, cmd_pkt, > > > > > > > tot_dsds); > > > > > > > + > > > > > > > + int_to_scsilun(cmd->device->lun, &cmd_pkt->lun); > > > > > > > + host_to_fcp_swap((uint8_t *)&cmd_pkt->lun, > > > > > > > sizeof(cmd_pkt- > > > > > > > > lun)); > > > > > > > + > > > > > > > + /* build FCP_CMND IU */ > > > > > > > + int_to_scsilun(cmd->device->lun, &ctx->fcp_cmnd- > > > > > > > > lun); > > > > > > > + ctx->fcp_cmnd->additional_cdb_len = > > > > > > > additional_cdb_len; > > > > > > > + > > > > > > > + if (cmd->sc_data_direction == DMA_TO_DEVICE) > > > > > > > + ctx->fcp_cmnd->additional_cdb_len |= 1; > > > > > > > + else if (cmd->sc_data_direction == > > > > > > > DMA_FROM_DEVICE) > > > > > > > + ctx->fcp_cmnd->additional_cdb_len |= 2; > > > > > > > + > > > > > > > + /* Populate the FCP_PRIO. */ > > > > > > > + if (ha->flags.fcp_prio_enabled) > > > > > > > + ctx->fcp_cmnd->task_attribute |= > > > > > > > + sp->fcport->fcp_prio << 3; > > > > > > > + > > > > > > > + memcpy(ctx->fcp_cmnd->cdb, cmd->cmnd, cmd- > > > > > > > > cmd_len); > > > > > > > + > > > > > > > + fcp_dl = (__be32 *)(ctx->fcp_cmnd->cdb + 16 + > > > > > > > + additional_cdb_len); > > > > > > > + *fcp_dl = htonl((uint32_t)scsi_bufflen(cmd)); > > > > > > > + > > > > > > > + cmd_pkt->fcp_cmnd_dseg_len = cpu_to_le16(ctx- > > > > > > > > fcp_cmnd_len); > > > > > > > + put_unaligned_le64(ctx->fcp_cmnd_dma, > > > > > > > + &cmd_pkt- > > > > > > > > fcp_cmnd_dseg_address); > > > > > > > + > > > > > > > + sp->flags |= SRB_FCP_CMND_DMA_VALID; > > > > > > > + cmd_pkt->byte_count = > > > > > > > cpu_to_le32((uint32_t)scsi_bufflen(cmd)); > > > > > > > + /* Set total data segment count. */ > > > > > > > + cmd_pkt->entry_count = (uint8_t)req_cnt; > > > > > > > + > > > > > > > + wmb(); > > > > > > > + /* Adjust ring index. */ > > > > > > > + req->ring_index++; > > > > > > > + if (req->ring_index == req->length) { > > > > > > > + req->ring_index = 0; > > > > > > > + req->ring_ptr = req->ring; > > > > > > > + } else { > > > > > > > + req->ring_ptr++; > > > > > > > + } > > > > > > > + > > > > > > > + sp->qpair->cmd_cnt++; > > > > > > > + sp->flags |= SRB_DMA_VALID; > > > > > > > + > > > > > > > + /* Set chip new ring index. */ > > > > > > > + wrt_reg_dword(req->req_q_in, req->ring_index); > > > > > > > + > > > > > > > + /* Manage unprocessed RIO/ZIO commands in > > > > > > > response > > > > > > > queue. > > > > > > > */ > > > > > > > + if (vha->flags.process_response_queue && > > > > > > > + rsp->ring_ptr->signature != > > > > > > > RESPONSE_PROCESSED) > > > > > > > + qla24xx_process_response_queue(vha, rsp); > > > > > > > + > > > > > > > + spin_unlock_irqrestore(&qpair->qp_lock, flags); > > > > > > > + > > > > > > > + return QLA_SUCCESS; > > > > > > > + > > > > > > > +queuing_error_fcp_cmnd: > > > > > > > + dma_pool_free(ha->fcp_cmnd_dma_pool, ctx- > > > > > > > >fcp_cmnd, > > > > > > > ctx- > > > > > > > > fcp_cmnd_dma); > > > > > > > + > > > > > > > +queuing_error: > > > > > > > + if (tot_dsds) > > > > > > > + scsi_dma_unmap(cmd); > > > > > > > + > > > > > > > + qla_put_fw_resources(sp->qpair, &sp->iores); > > > > > > > + > > > > > > > + if (sp->u.scmd.crc_ctx) { > > > > > > > + mempool_free(sp->u.scmd.crc_ctx, ha- > > > > > > > > ctx_mempool); > > > > > > > + sp->u.scmd.crc_ctx = NULL; > > > > > > > + } > > > > > > > + > > > > > > > + spin_unlock_irqrestore(&qpair->qp_lock, flags); > > > > > > > + > > > > > > > + return QLA_FUNCTION_FAILED; > > > > > > > +} > > > > > > > diff --git a/drivers/scsi/qla2xxx/qla_nx.h > > > > > > > b/drivers/scsi/qla2xxx/qla_nx.h > > > > > > > index 6dc80c8ddf79..5d1bdc15b75c 100644 > > > > > > > --- a/drivers/scsi/qla2xxx/qla_nx.h > > > > > > > +++ b/drivers/scsi/qla2xxx/qla_nx.h > > > > > > > @@ -857,7 +857,9 @@ struct fcp_cmnd { > > > > > > > uint8_t task_attribute; > > > > > > > uint8_t task_management; > > > > > > > uint8_t additional_cdb_len; > > > > > > > - uint8_t cdb[260]; /* 256 for CDB len and 4 for > > > > > > > FCP_DL > > > > > > > */ > > > > > > > +#define QLA_CDB_BUF_SIZE 256 > > > > > > > +#define QLA_FCP_DL_SIZE 4 > > > > > > > + uint8_t cdb[QLA_CDB_BUF_SIZE + QLA_FCP_DL_SIZE]; > > > > > > > /* > > > > > > > 256 > > > > > > > for > > > > > > > CDB len and 4 for FCP_DL */ > > > > > > > }; > > > > > > > > > > > > > > struct dsd_dma { > > > > > > > > > > > > QT and Nilesh, Thank you. > > > > > > I need more time to look at this fully but I will get it > > > > > > tested > > > > > > in > > > > > > the > > > > > > meantime. > > > > > > I will reply with a review or questions later. > > > > > > > > > > > > Thanks for sending this > > > > > > Laurence > > > > > > > > > > @QT and Nilesh > > > > > Which kernel was this against. > > > > > Did you guys test this only agains your driver or also > > > > > upstream > > > > > I cannot build it clean because There is no structure member > > > > > dsd_inuse > > > > > > > > > > Can you send one for upstream please > > > > > > > > > > Patch applied fine to 6.5-rc4 > > > > > > > > > > [loberman@segstorage3 linux]$ patch -p1 < qt.patch > > > > > patching file drivers/scsi/qla2xxx/qla_iocb.c > > > > > Hunk #2 succeeded at 1718 (offset -4 lines). > > > > > Hunk #3 succeeded at 2099 (offset -4 lines). > > > > > Hunk #4 succeeded at 4179 (offset -31 lines). > > > > > patching file drivers/scsi/qla2xxx/qla_nx.h > > > > > > > > > > > > > > > Build fails > > > > > > > > > > drivers/scsi/qla2xxx/qla_iocb.c: In function > > > > > ‘qla_start_scsi_type6’: > > > > > drivers/scsi/qla2xxx/qla_iocb.c:4260:29: error: ‘struct > > > > > qla_qpair’ > > > > > has > > > > > no member named ‘dsd_inuse’ > > > > > if ((more_dsd_lists + qpair->dsd_inuse) >= NUM_DSD_CHAIN) { > > > > > ^~ > > > > > drivers/scsi/qla2xxx/qla_iocb.c:4263:32: error: ‘struct > > > > > qla_qpair’ > > > > > has > > > > > no member named ‘dsd_inuse’ > > > > > more_dsd_lists + qpair->dsd_inuse, NUM_DSD_CHAIN, > > > > > cmd); > > > > > ^~ > > > > > drivers/scsi/qla2xxx/qla_iocb.c:4267:29: error: ‘struct > > > > > qla_qpair’ > > > > > has > > > > > no member named ‘dsd_avail’ > > > > > if (more_dsd_lists <= qpair->dsd_avail) > > > > > ^~ > > > > > drivers/scsi/qla2xxx/qla_iocb.c:4270:26: error: ‘struct > > > > > qla_qpair’ > > > > > has > > > > > no member named ‘dsd_avail’ > > > > > more_dsd_lists -= qpair->dsd_avail; > > > > > ^~ > > > > > drivers/scsi/qla2xxx/qla_iocb.c:4289:41: error: ‘struct > > > > > qla_qpair’ > > > > > has > > > > > no member named ‘dsd_list’; did you mean ‘hints_list’? > > > > > list_add_tail(&dsd_ptr->list, &qpair->dsd_list); > > > > > ^~~~~~~~ > > > > > hints_list > > > > > drivers/scsi/qla2xxx/qla_iocb.c:4290:8: error: ‘struct > > > > > qla_qpair’ > > > > > has > > > > > no member named ‘dsd_avail’ > > > > > qpair->dsd_avail++; > > > > > > > > I see where this was done, there are two more patches > > > > So I will add those and test > > > > [PATCH 1/2] qla2xxx: Move resource to allow code reuse > > > > > > > > > > > > > > > OK, I was able to now build and and test this. > > > To be honest there are a lot of changes. > > > I just tested the original issue for which I had sent my first > > > patch. > > > > > > Unfortunately the test fails in a different way to the orginal > > > BUG > > > but > > > this may be firmware behavior on the MSA2050 that takes the > > > device > > > paths away. > > > This is because the test passes on an LIO target > > > > > > Against 6.5-rc4 > > > > > > Test notes > > > segstorage3 6.5.0-rc4+ > > > Note that on a MSA2050 it will hang and timeout no more panics as > > > expected. However as noted we lose devices. > > > > > > Starts out fine > > > > > > mpathbp (3600c0ff00043feab07a7ff6307000000) dm-8 HPE,MSA 2050 SAN > > > size=9.3G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw > > > > -+- policy='service-time 0' prio=50 status=active > > > > > - 1:0:1:4 sdh 8:112 active ready running > > > > > - 1:0:4:4 sdaf 65:240 active ready running > > > > > - 3:0:2:4 sdfk 130:96 active ready running > > > > `- 3:0:3:4 sdft 130:240 active ready running > > > `-+- policy='service-time 0' prio=10 status=enabled > > > |- 1:0:2:4 sdp 8:240 active ready running > > > |- 1:0:3:4 sdx 65:112 active ready running > > > |- 3:0:1:4 sdfb 129:208 active ready running > > > `- 3:0:4:4 sdge 131:160 active ready running > > > > > > Do the test > > > > > > [root@segstorage3 ~]# sg_write_same -T --lba=1 > > > /dev/mapper/mpathbp > > > Write same: transport: Host_status=0x03 [DID_TIME_OUT] > > > Driver_status=0x00 [DRIVER_OK] > > > > > > It then caused problems with two of my multipath device paths but > > > perhaps this is array firmware behavior. > > > > > > [ 820.836495] sd 3:0:2:4: [sdfk] tag#1704 FAILED Result: > > > hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=23s > > > [ 820.836497] sd 3:0:2:4: [sdfk] tag#1704 CDB: Test Unit Ready > > > 00 > > > 00 > > > 00 00 00 00 > > > > > > [ 825.926075] sd 3:0:2:0: rejecting I/O to offline device > > > [ 825.926136] sd 3:0:2:1: rejecting I/O to offline device > > > [ 825.926165] sd 3:0:2:3: rejecting I/O to offline device > > > > > > And two paths get lost > > > > > > mpathbp (3600c0ff00043feab07a7ff6307000000) dm-8 HPE,MSA 2050 SAN > > > size=9.3G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw > > > > -+- policy='service-time 0' prio=50 status=active > > > > > - 1:0:4:4 sdaf 65:240 active ready running > > > > `- 3:0:3:4 sdft 130:240 active ready running > > > `-+- policy='service-time 0' prio=10 status=enabled > > > |- 1:0:2:4 sdp 8:240 active ready running > > > |- 1:0:3:4 sdx 65:112 active ready running > > > |- 3:0:1:4 sdfb 129:208 active ready running > > > `- 3:0:4:4 sdge 131:160 active ready running > > > > > > I re-scanned but sdh and sdfk are gone from the device tree and > > > do > > > not > > > come back with a rescan. > > > > > > [ 846.043511] sd 1:0:1:4: [sdh] Synchronizing SCSI cache > > > [ 846.043536] sd 1:0:1:4: [sdh] Synchronize Cache(10) failed: > > > Result: > > > hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK > > > > > > [ 820.836495] sd 3:0:2:4: [sdfk] tag#1704 FAILED Result: > > > hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=23s > > > [ 820.836497] sd 3:0:2:4: [sdfk] tag#1704 CDB: Test Unit Ready > > > 00 > > > 00 > > > 00 00 00 00 > > > [ 851.166451] sd 3:0:2:4: [sdfk] Synchronizing SCSI cache > > > [ 851.166477] sd 3:0:2:4: [sdfk] Synchronize Cache(10) failed: > > > Result: > > > hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK > > > > > > Now using an LIO target via tcm_qla2xxx > > > > > > mpathy (3600140549ed41778d414038bf1f1bb42) dm-74 LIO-ORG,test-70 > > > size=1.0G features='0' hwhandler='1 alua' wp=rw > > > `-+- policy='service-time 0' prio=50 status=active > > > |- 1:0:5:105 sdbv 68:144 active ready running > > > |- 1:0:6:105 sdei 128:160 active ready running > > > |- 3:0:5:105 sdhz 134:144 active ready running > > > `- 3:0:6:105 sdkm 66:416 active ready running > > > > > > [root@segstorage3 block]# sg_write_same -T --lba=1 > > > /dev/mapper/mpathy > > > Write same: transport: Host_status=0x07 [DID_ERROR] > > > Driver_status=0x08 [DRIVER_SENSE] > > > > > > No issues with the LIO device paths > > > > > > [root@segstorage3 block]# multipath -ll mpathy > > > mpathy (3600140549ed41778d414038bf1f1bb42) dm-74 LIO-ORG,test-70 > > > size=1.0G features='0' hwhandler='1 alua' wp=rw > > > `-+- policy='service-time 0' prio=50 status=active > > > |- 1:0:5:105 sdbv 68:144 active ready running > > > |- 1:0:6:105 sdei 128:160 active ready running > > > |- 3:0:5:105 sdhz 134:144 active ready running > > > `- 3:0:6:105 sdkm 66:416 active ready running > > > > > > So I would like to give a positive test result but will need you > > > folks > > > at Marvell to maybe first work with HPE on the MSA2050 behavior. > > > > > > Thanks > > > Laurence > > Hello Nilesh and Marvell > > > > I ran a few more tests, this 32 bytes CDB patch set totally breaks > > the > > MSA2050 such that a full storage side reboot of the controller pair > > is > > needed to get the paths back. > > > > For that reason I am going to NACK this until you chat with HPE. > > Those MSA2050's are installed everywhere. > > > > Not pushing my original fix, but my patch would land up with the > > MSA2050 timing out but then all would function as it should after > > the > > timeout. Something about your change is not making the MSA2050 > > happy > > at > > all. > > > > Thanks > > Laurence > > > > > > > > Turns out the patch I had sent earlier also makes the MSA2050 unhappy > so I will reach out to HPE about your submission and the behavior and > get back to you. > > Again, the LIO seems to work fine with your patch set. > > This patch below also breaks the MSA, I guess I did not not see it > during my first set of tests > > > diff --git a/drivers/scsi/qla2xxx/qla_iocb.c > b/drivers/scsi/qla2xxx/qla_iocb.c > index 730d8609276c..32809e556ec6 100644 > --- a/drivers/scsi/qla2xxx/qla_iocb.c > +++ b/drivers/scsi/qla2xxx/qla_iocb.c > @@ -1386,7 +1386,8 @@ qla24xx_build_scsi_crc_2_iocbs(srb_t *sp, > struct > cmd_type_crc_2 *cmd_pkt, > if ((scsi_get_prot_op(cmd) == SCSI_PROT_READ_INSERT) || > (scsi_get_prot_op(cmd) == SCSI_PROT_WRITE_STRIP) || > (scsi_get_prot_op(cmd) == SCSI_PROT_READ_STRIP) || > - (scsi_get_prot_op(cmd) == SCSI_PROT_WRITE_INSERT)) > + (scsi_get_prot_op(cmd) == SCSI_PROT_WRITE_INSERT) || > + (scsi_get_prot_op(cmd) == SCSI_PROT_NORMAL)) > bundling = 0; > > /* Allocate CRC context from global pool */ > @@ -1448,6 +1449,9 @@ qla24xx_build_scsi_crc_2_iocbs(srb_t *sp, > struct > cmd_type_crc_2 *cmd_pkt, > dif_bytes = (data_bytes / blk_size) * 8; > > switch (scsi_get_prot_op(GET_CMD_SP(sp))) { > + case SCSI_PROT_NORMAL: > + total_bytes = data_bytes; > + break; > case SCSI_PROT_READ_INSERT: > case SCSI_PROT_WRITE_STRIP: > total_bytes = data_bytes; > Hello I think we can ack this patch. The MSA20250 behavior seems to be a firmware issue specific to that array because the LIO target passes all the tests. I did reach out to HPE and let them know but given that without this patch we panic a server, the patch is needed. Hopefully HPE follows up with me. So: Tested-by: Laurence Oberman <loberman@xxxxxxxxxx> noting that the MSA array sees an issue not seen on other arrays tested so far. Thanks