Nicholas, At this point we can forgo this patch to stable tree. There are additional tweaks in this path during testing. We will update the stable tree with new findings. Thanks. Regards, Quinn Tran On 5/22/14 8:00 PM, "Nicholas A. Bellinger" <nab@xxxxxxxxxxxxxxx> wrote: >Hi Saurav + Quinn, > >Just curious if this fix needs to be CC'ed to stable as well, or if it's >something that is only triggered with the preceding T10 DIF support >patch in place..? > >--nab > >On Fri, 2014-04-11 at 16:54 -0400, Saurav Kashyap wrote: >> From: Quinn Tran <quinn.tran@xxxxxxxxxx> >> >> Fix double free problem within qla2xxx driver where >> current code prematurely free qla_tgt_cmd while firmware >> still has the command. When firmware release the command >> after abort, the code attempt a second free as part of >> command completion processing. >> >> When TCM start the free process, NULL pointer was hit. >> >> ------ >> WARNING: CPU: 8 PID: 43613 at lib/list_debug.c:62 >>__list_del_entry+0x82/0xd0() >> list_del corruption. next->prev should be ffff88082b5cfb08, but was >>6b6b6b6b6b6b6b6b >> CPU: 8 PID: 43613 Comm: kworker/8:0 Tainted: GF W O >>3.13.0-rc3-nab_t10dif+ #6 >> Hardware name: HP ProLiant DL380p Gen8, BIOS P70 08/20/2012 >> Workqueue: events cache_reap >> 000000000000003e ffff88081b2e3c78 ffffffff815a051f 000000000000003e >> ffff88081b2e3cc8 ffff88081b2e3cb8 ffffffff8104fc2c 0000000000000000 >> ffff88082b5cfb00 ffff88081c788d00 ffff88082b5d7200 ffff88082b5d3080 >> Call Trace: >> [<ffffffff815a051f>] dump_stack+0x49/0x62 >> [<ffffffff8104fc2c>] warn_slowpath_common+0x8c/0xc0 >> [<ffffffff8104fd16>] warn_slowpath_fmt+0x46/0x50 >> [<ffffffff812b6592>] __list_del_entry+0x82/0xd0 >> [<ffffffff8106d48c>] process_one_work+0x12c/0x510 >> [<ffffffff8106d4d3>] ? process_one_work+0x173/0x510 >> [<ffffffff8106ebdf>] worker_thread+0x11f/0x3a0 >> [<ffffffff8106eac0>] ? manage_workers+0x170/0x170 >> [<ffffffff81074f26>] kthread+0xf6/0x120 >> [<ffffffff8109f103>] ? __lock_release+0x133/0x1b0 >> [<ffffffff81074e30>] ? __init_kthread_worker+0x70/0x70 >> [<ffffffff815aec2c>] ret_from_fork+0x7c/0xb0 >> [<ffffffff81074e30>] ? __init_kthread_worker+0x70/0x70 >> ---[ end trace dfc05c3f7caf8ebe ]--- >> BUG: unable to handle kernel NULL pointer dereference at >>0000000000000008 >> IP: [<ffffffff8106d391>] process_one_work+0x31/0x510 >> ------- >> >> Signed-off-by: Quinn Tran <quinn.tran@xxxxxxxxxx> >> Signed-off-by: Giridhar Malavali <giridhar.malavali@xxxxxxxxxx> >> Signed-off-by: Saurav Kashyap <saurav.kashyap@xxxxxxxxxx> >> --- >> drivers/scsi/qla2xxx/qla_target.c | 29 ++++++++++++++++++++++++++--- >> 1 files changed, 26 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/scsi/qla2xxx/qla_target.c >>b/drivers/scsi/qla2xxx/qla_target.c >> index f24d44c..b1d10f9 100644 >> --- a/drivers/scsi/qla2xxx/qla_target.c >> +++ b/drivers/scsi/qla2xxx/qla_target.c >> @@ -2681,7 +2681,18 @@ static void qlt_send_term_exchange(struct >>scsi_qla_host *vha, >> rc = __qlt_send_term_exchange(vha, cmd, atio); >> spin_unlock_irqrestore(&vha->hw->hardware_lock, flags); >> done: >> - if (rc == 1) { >> + /* >> + * Terminate exchange will tell fw to release any active CTIO >> + * that's in FW posession and cleanup the exchange. >> + * >> + * "cmd->state == QLA_TGT_STATE_ABORTED" means CTIO is still >> + * down at FW. Free the cmd later when CTIO comes back later >> + * w/aborted(0x2) status. >> + * >> + * "cmd->state != QLA_TGT_STATE_ABORTED" means CTIO is already >> + * back w/some err. Free the cmd now. >> + */ >> + if ((rc == 1) && (cmd->state != QLA_TGT_STATE_ABORTED)) { >> if (!ha_locked && !in_interrupt()) >> msleep(250); /* just in case */ >> >> @@ -2689,6 +2700,7 @@ done: >> qlt_unmap_sg(vha, cmd); >> vha->hw->tgt.tgt_ops->free_cmd(cmd); >> } >> + return; >> } >> >> void qlt_free_cmd(struct qla_tgt_cmd *cmd) >> @@ -2906,6 +2918,7 @@ static void qlt_do_ctio_completion(struct >>scsi_qla_host *vha, uint32_t handle, >> case CTIO_LIP_RESET: >> case CTIO_TARGET_RESET: >> case CTIO_ABORTED: >> + /* driver request abort via Terminate exchange */ >> case CTIO_TIMEOUT: >> case CTIO_INVALID_RX_ID: >> /* They are OK */ >> @@ -2974,9 +2987,18 @@ static void qlt_do_ctio_completion(struct >>scsi_qla_host *vha, uint32_t handle, >> break; >> } >> >> - if (cmd->state != QLA_TGT_STATE_NEED_DATA) >> + >> + /* "cmd->state == QLA_TGT_STATE_ABORTED" means >> + * cmd is already aborted/terminated, we don't >> + * need to terminate again. The exchange is already >> + * cleaned up/freed at FW level. Just cleanup at driver >> + * level. >> + */ >> + if ((cmd->state != QLA_TGT_STATE_NEED_DATA) && >> + (cmd->state != QLA_TGT_STATE_ABORTED)) { >> if (qlt_term_ctio_exchange(vha, ctio, cmd, status)) >> return; >> + } >> } >> skip_term: >> >> @@ -3007,7 +3029,8 @@ skip_term: >> "not return a CTIO complete\n", vha->vp_idx, cmd->state); >> } >> >> - if (unlikely(status != CTIO_SUCCESS)) { >> + if (unlikely(status != CTIO_SUCCESS) && >> + (cmd->state != QLA_TGT_STATE_ABORTED)) { >> ql_dbg(ql_dbg_tgt_mgt, vha, 0xf01f, "Finishing failed CTIO\n"); >> dump_stack(); >> } > > >-- >To unsubscribe from this list: send the line "unsubscribe linux-scsi" in >the body of a message to majordomo@xxxxxxxxxxxxxxx >More majordomo info at http://vger.kernel.org/majordomo-info.html ________________________________ This message and any attached documents contain information from QLogic Corporation or its wholly-owned subsidiaries that may be confidential. If you are not the intended recipient, you may not read, copy, distribute, or use this information. If you have received this transmission in error, please notify the sender immediately by reply e-mail and then delete this message.
<<attachment: winmail.dat>>