Roland,

Thanks for the reply. I've actually always been a bit confused about the
whole idea of applying upstream commits. Target is part of the kernel,
right? In this case, since I'm using Fedora, how would I do that? Would I
download the Fedora kernel source code, patch it, and compile my own
kernel?

Here are all the logs I have leading up to the crash:

Jan 24 09:57:19 dracofiler kernel: TARGET_CORE[qla2xxx]: Detected NON_EXISTENT_LUN Access for 0x000000f6
Jan 24 09:57:19 dracofiler kernel: TARGET_CORE[qla2xxx]: Detected NON_EXISTENT_LUN Access for 0x000000f7
Jan 24 09:57:19 dracofiler kernel: TARGET_CORE[qla2xxx]: Detected NON_EXISTENT_LUN Access for 0x000000f8
Jan 24 09:57:19 dracofiler kernel: TARGET_CORE[qla2xxx]: Detected NON_EXISTENT_LUN Access for 0x000000f9
Jan 24 09:57:19 dracofiler kernel: TARGET_CORE[qla2xxx]: Detected NON_EXISTENT_LUN Access for 0x000000fa
Jan 24 09:57:19 dracofiler kernel: TARGET_CORE[qla2xxx]: Detected NON_EXISTENT_LUN Access for 0x000000fb
Jan 24 09:57:19 dracofiler kernel: TARGET_CORE[qla2xxx]: Detected NON_EXISTENT_LUN Access for 0x000000fc
Jan 24 09:57:19 dracofiler kernel: TARGET_CORE[qla2xxx]: Detected NON_EXISTENT_LUN Access for 0x000000fd
Jan 24 09:57:19 dracofiler kernel: TARGET_CORE[qla2xxx]: Detected NON_EXISTENT_LUN Access for 0x000000fe
Jan 24 09:57:19 dracofiler kernel: TARGET_CORE[qla2xxx]: Detected NON_EXISTENT_LUN Access for 0x000000ff
Jan 24 10:00:29 dracofiler kernel: MODE SENSE: unimplemented page/subpage: 0x1c/0x02
Jan 24 10:01:21 dracofiler kernel: ABORT_TASK: Found referenced qla2xxx task_tag: 1144976
Jan 24 10:01:21 dracofiler kernel: ABORT_TASK: ref_tag: 1144976 already complete, skipping
Jan 24 10:01:21 dracofiler kernel: ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1144976
Jan 24 10:01:21 dracofiler kernel: ABORT_TASK: Found referenced qla2xxx task_tag: 1145020
Jan 24 10:01:21 dracofiler kernel: ABORT_TASK: ref_tag: 1145020 already complete, skipping
Jan 24 10:01:21 dracofiler kernel: ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1145020
Jan 24 10:01:41 dracofiler kernel: ABORT_TASK: Found referenced qla2xxx task_tag: 1146120
Jan 24 10:01:41 dracofiler kernel: ABORT_TASK: ref_tag: 1146120 already complete, skipping
Jan 24 10:01:41 dracofiler kernel: ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1146120
Jan 24 10:01:41 dracofiler kernel: ABORT_TASK: Found referenced qla2xxx task_tag: 1146076
Jan 24 10:01:41 dracofiler kernel: ABORT_TASK: ref_tag: 1146076 already complete, skipping
Jan 24 10:01:41 dracofiler kernel: ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1146076
Jan 24 10:01:48 dracofiler kernel: Detected MISCOMPARE for addr: ffff880616856000 buf: ffff880626af5000
Jan 24 10:01:48 dracofiler kernel: Target/iblock: Send MISCOMPARE check condition and sense
Jan 24 10:01:48 dracofiler kernel: Detected MISCOMPARE for addr: ffff880629bfc000 buf: ffff880626af5e00
Jan 24 10:01:48 dracofiler kernel: Target/iblock: Send MISCOMPARE check condition and sense
Jan 24 10:01:56 dracofiler kernel: ABORT_TASK: Found referenced qla2xxx task_tag: 1196456

Thanks,
Dan

On Sun, Jan 24, 2016 at 8:11 PM, Roland Dreier <roland@xxxxxxxxxxxxxxx> wrote:
>> I have tried a large number of other hosts and they all act the same
>> way regardless of hardware. ESXi <6 is no problem, but 6 and newer
>> crash the filer very quickly.
>
> You're crashing because of
>
> Jan 24 10:02:09 dracofiler kernel: kernel BUG at
> drivers/scsi/qla2xxx/qla_target.c:3105!
>
> which is the BUG_ON in
>
> void qlt_free_cmd(struct qla_tgt_cmd *cmd)
> {
>         struct qla_tgt_sess *sess = cmd->sess;
>
>         ql_dbg(ql_dbg_tgt, cmd->vha, 0xe074,
>             "%s: se_cmd[%p] ox_id %04x\n",
>             __func__, &cmd->se_cmd,
>             be16_to_cpu(cmd->atio.u.isp24.fcp_hdr.ox_id));
>
>         BUG_ON(cmd->cmd_in_wq);
>
> It seems we're freeing a command before we process it.
>
> what logging do you have from target or qla2xxx before you hit the
> crash? I'm wondering why the initiator is aborting commands (although
> we still shouldn't crash even if it does abort commands).
>
> You could try applying upstream commit 193b50b9d54a ("qla2xxx: Replace
> QLA_TGT_STATE_ABORTED with a bit.") which seems like it might be
> related, though I'm not sure whether it really will help.
>
> - R.