Re: kernel BUG at drivers/scsi/qla2xxx/qla_target.c:3319!

On 26-06-17 00:31, Nicholas A. Bellinger wrote:
Hi Himanshu & Pascal,

On Mon, 2017-06-19 at 17:53 +0000, Madhani, Himanshu wrote:
Hi Pascal, Nic,

Sorry for the delay in responding.

On Jun 19, 2017, at 12:20 AM, Pascal de Bruijn <p.debruijn@xxxxxxxxxxx> wrote:

Hi,

As I've illustrated in a previous thread ("ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag"), we've been experiencing kernel lockups while using LIO as an FC target.

We've collected the kernel trace via serial console; this is the second lockup where we've been able to capture proper kernel messages. The trace is very similar to the previous one, so the issue clearly recurs for us, though so far we cannot trigger it at will. It happens once or twice a week, always at night:

Detected MISCOMPARE for addr: ffff91f3c55ac000 buf: ffff91f3d08dd600
Target/iblock: Send MISCOMPARE check condition and sense
ABORT_TASK: Found referenced qla2xxx task_tag: 1203652
ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1203652
ABORT_TASK: Found referenced qla2xxx task_tag: 1203700
------------[ cut here ]------------
kernel BUG at drivers/scsi/qla2xxx/qla_target.c:3319!
invalid opcode: 0000 [#1] SMP
Modules linked in: tcm_qla2xxx target_core_user uio target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod configfs raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq md_mod ipmi_devintf ses enclosure sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp qla2xxx scsi_transport_fc mei_me mpt3sas igb mei ioatdma i2c_algo_bit dca wmi ipmi_si ipmi_msghandler
CPU: 13 PID: 24122 Comm: kworker/u34:0 Not tainted 4.9.30-ul5 #3
Hardware name: Supermicro SYS-1028U-E1CRTP+/X10DRU-i+, BIOS 2.0b 08/09/2016
Workqueue: tmr-iblock target_tmr_work [target_core_mod]
task: ffff91fbd4e42340 task.stack: ffffaba461684000
RIP: 0010:[<ffffffffc0625ba8>] [<ffffffffc0625ba8>] qlt_free_cmd+0x148/0x160 [qla2xxx]
RSP: 0018:ffffaba461687d30  EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff91fbcbd7cd48 RCX: ffffffffc064d692
RDX: 000000000000e074 RSI: ffff91fbd34c65a0 RDI: 0000000000004000
RBP: ffffaba461687d50 R08: ffffffffc063a7d0 R09: ffff91fbcbd7cd48
R10: ffff91f3d9822bc0 R11: ffff91f3ffff9f80 R12: ffff91fbce03b680
R13: 0000000000000246 R14: 0000000000125df4 R15: ffff91fbcb5ffac0
FS:  0000000000000000(0000) GS:ffff91fbffd40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f9b9397ef60 CR3: 0000000366407000 CR4: 00000000001406e0
Stack:
  ffff91fb0000048b ffff91fbcbd7cd48 ffff91fbcff9c870 0000000000000246
  ffffaba461687d60 ffffffffc0a49f2f ffffaba461687d88 ffffffffc09bf417
  ffff91fbcbd7cdd0 ffff91fbcbd7cd48 ffff91eee98b0000 ffffaba461687dd8
Call Trace:
  [<ffffffffc0a49f2f>] tcm_qla2xxx_release_cmd+0xf/0x20 [tcm_qla2xxx]
  [<ffffffffc09bf417>] target_put_sess_cmd+0xb7/0x120 [target_core_mod]
  [<ffffffffc09baa8a>] core_tmr_abort_task+0x12a/0x180 [target_core_mod]
  [<ffffffffc09bcfc1>] target_tmr_work+0x111/0x120 [target_core_mod]
  [<ffffffff86c8fed1>] process_one_work+0x121/0x3c0
  [<ffffffff86c901b3>] worker_thread+0x43/0x4d0
  [<ffffffff86c90170>] ? process_one_work+0x3c0/0x3c0
  [<ffffffff86c90170>] ? process_one_work+0x3c0/0x3c0
  [<ffffffff86c959a4>] kthread+0xd4/0xf0
  [<ffffffff86c958d0>] ? kthread_worker_fn+0x100/0x100
  [<ffffffff8708f922>] ret_from_fork+0x22/0x30
Code: 99 a6 c6 0f b6 83 8c 02 00 00 e9 3f ff ff ff be 05 0d 00 00 48 c7 c7 e8 6e 64 c0 e8 53 69 65 c6 48 83 c4 08 5b 41 5c 41 5d 5d c3 <0f> 0b 0f 0b 48 8b bb 90 02 00 00 e8 58 14 71 c6 e9 2c ff ff ff
RIP  [<ffffffffc0625ba8>] qlt_free_cmd+0x148/0x160 [qla2xxx]
RSP <ffffaba461687d30>
---[ end trace 82b53ac345806386 ]---

Given the trace above, it would seem this is an issue in the qla2xxx driver's target mode code, where the following bit seems particularly relevant:

linux-4.9.30/drivers/scsi/qla2xxx/qla_target.c

void qlt_free_cmd(struct qla_tgt_cmd *cmd)
{
        struct qla_tgt_sess *sess = cmd->sess;

        ql_dbg(ql_dbg_tgt, cmd->vha, 0xe074,
            "%s: se_cmd[%p] ox_id %04x\n",
            __func__, &cmd->se_cmd,
            be16_to_cpu(cmd->atio.u.isp24.fcp_hdr.ox_id));

        BUG_ON(cmd->cmd_in_wq);
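
If I'm reading this right, cmd_in_wq is set while a work item for the command is pending and cleared once that work has run, so the BUG_ON means the command is being freed while its work is still believed to be queued. A rough sketch of that invariant, as plain compilable C of my own and not the actual driver code:

/* Illustration only: mimics the cmd_in_wq invariant that qlt_free_cmd() asserts. */
#include <assert.h>
#include <stdio.h>

struct fake_cmd {
	int cmd_in_wq;			/* 1 while a work item for this cmd is pending */
};

static void fake_queue_work(struct fake_cmd *cmd)
{
	cmd->cmd_in_wq = 1;		/* set when the work is queued */
}

static void fake_work_fn(struct fake_cmd *cmd)
{
	cmd->cmd_in_wq = 0;		/* cleared once the work has run */
}

static void fake_free_cmd(struct fake_cmd *cmd)
{
	assert(!cmd->cmd_in_wq);	/* stands in for BUG_ON(cmd->cmd_in_wq) */
	printf("freed safely\n");
}

int main(void)
{
	struct fake_cmd cmd = { 0 };

	fake_queue_work(&cmd);
	fake_work_fn(&cmd);		/* skip this step and the assert fires,
					 * which is what the crash above amounts to */
	fake_free_cmd(&cmd);
	return 0;
}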


Can anybody confirm this is indeed an issue with the qla2xxx driver?
Is there any more information you'd need to be able to work the issue?

Regards,
Pascal de Bruijn


We are examining the stack trace that was posted; however, we'll need Nicholas's help.

Nic,

This stack trace is very similar to the one we had reported last year as an interaction issue between the target stack and the qla2xxx driver.
You had added the following patches to resolve that crash (https://www.spinics.net/lists/target-devel/msg11691.html):

     * | 310d3d314be7 target: Fix race with SCF_SEND_DELAYED_TAS handling
     * | 0f4a943168f3 target: Fix remote-port TMR ABORT + se_cmd fabric stop
     * | ebde1ca5a908 target: Fix TAS handling for multi-session se_node_acls
     * | a6d9bb1c9605 target: Fix LUN_RESET active TMR descriptor handling
     * | febe562c20df target: Fix LUN_RESET active I/O handling for ACK_KREF

Looks like in this environment we are still hitting the same issue. Does this diff make sense, adding a check for the TRANSPORT_COMPLETE state so that qlt_free_cmd() is not called from the normal command path?

diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 7a42bfd..35cecae 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -1815,7 +1815,13 @@ void __target_execute_cmd(struct se_cmd *cmd)
 	ret = cmd->execute_cmd(cmd);
 	if (ret) {
 		spin_lock_irq(&cmd->t_state_lock);
+		cmd->t_state = TRANSPORT_COMPLETE;
 		cmd->transport_state &= ~(CMD_T_BUSY|CMD_T_SENT);
+		cmd->transport_state |= CMD_T_COMPLETE;
 		spin_unlock_irq(&cmd->t_state_lock);
 		transport_generic_request_failure(cmd, ret);

In the meantime, I think we now know the circumstances that let this issue surface.

We can reliably reproduce the issue when we run 'iozone -a -I' in a VM that resides on the LIO-provided storage and then kick off an md check:

echo check > /sys/block/md0/md/sync_action

This is also probably why we were experiencing these issues consistently on the night between Saturday and Sunday, as an md check would be kicked off from cron.

We were also experimenting with md tuning via udev rules; our sync_speed_min change in particular may have exacerbated things:

SUBSYSTEM=="block", KERNEL=="md*", ACTION=="change", TEST=="md/sync_speed_min", ATTR{md/sync_speed_min}="100000" SUBSYSTEM=="block", KERNEL=="md*", ACTION=="change", TEST=="md/sync_speed_max", ATTR{md/sync_speed_max}="1000000"

After staring at this, there are two scenarios during ABORT_TASK where
this could occur.

Namely, there are two cases where tcm_qla2xxx.c calls
tcm_qla2xxx_free_cmd() internally during ABORT_TASK:

   - During tcm_qla2xxx_handle_data_work() when cmd->aborted is detected.
   - During tcm_qla2xxx_aborted_task() when QLA_TGT_STATE_NEW or
     QLA_TGT_STATE_DATA_IN are set.

My guess is the latter case is happening: tcm_qla2xxx_free_cmd() ends up
queuing qla_tgt_cmd->work from transport_cmd_finish_abort() ->
tcm_qla2xxx_aborted_task(), immediately before target_put_sess_cmd() is
called from core_tmr_abort_task() to perform the final
kref_put(&cmd->kref, ...).
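
Spelled out, the ordering I suspect is roughly the following (a sketch of the suspected call sequence, not verified line-by-line against the code):

core_tmr_abort_task()
  -> transport_cmd_finish_abort()
       -> tcm_qla2xxx_aborted_task()
            -> tcm_qla2xxx_free_cmd()
                 cmd->cmd_in_wq = 1;
                 queue_work(..., &cmd->work);     /* work now pending   */
  -> target_put_sess_cmd()                        /* final kref_put()   */
       -> tcm_qla2xxx_release_cmd()
            -> qlt_free_cmd()
                 BUG_ON(cmd->cmd_in_wq);          /* fires: the queued
                                                     work never got to run */

That matches the call trace in Pascal's report above.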

So there are two things I'd like Pascal to try.

First, is to pick up one recent bug fix to address the possible
underflow for this scenario during ABORT_TASK:

target: Fix kref->refcount underflow in transport_cmd_finish_abort
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/target?id=73d4e580ccc5c3e05cea002f18111f66c9c07034

Note this is not directly related, but is still useful to have to avoid
the possible underflow of se_cmd->cmd_kref during this particular
scenario.  Pascal, let us know if you have any problems applying this
patch to v4.9.30.

I've moved our test setup to v4.9.34, with which the issue is still reproducible. The above patch applies just fine.

With only 73d4e580 applied the issue is still reproducible.

The second is the following patch to catch this scenario in progress
during qlt_free_cmd(), and clear ->cmd_in_wq + dump the state before
doing a cancel_work_sync().

diff --git a/drivers/scsi/qla2xxx/qla_target.h b/drivers/scsi/qla2xxx/qla_target.h
index d644202..65752af 100644
--- a/drivers/scsi/qla2xxx/qla_target.h
+++ b/drivers/scsi/qla2xxx/qla_target.h
@@ -876,7 +876,7 @@ struct qla_tgt_cmd {
 	unsigned int q_full:1;
 	unsigned int term_exchg:1;
 	unsigned int cmd_sent_to_fw:1;
-	unsigned int cmd_in_wq:1;
+	unsigned int cmd_in_wq;
 	unsigned int aborted:1;
 	unsigned int data_work:1;
 	unsigned int data_work_free:1;
diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
index 75aeb9f..3ceb371 100644
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
@@ -336,6 +336,17 @@ static void tcm_qla2xxx_release_cmd(struct se_cmd *se_cmd)
 	}
 	cmd = container_of(se_cmd, struct qla_tgt_cmd, se_cmd);
+
+	if (cmpxchg(&cmd->cmd_in_wq, 1, 0) == 1) {
+		printk("Canceling outstanding qla_tgt_cmd->cmd_in_wq = 1"
+			" state: %d data_work: %d data_work_free: %d trc_flags: 0x%08x"
+			" se_cmd->se_cmd_flags: 0x%08x se_cmd->transport_state: 0x%08x\n"
+			" se_cmd->t_task_cdb: 0x%02x\n", cmd->state, cmd->data_work,
+			cmd->data_work_free, cmd->trc_flags, cmd->se_cmd.se_cmd_flags,
+			cmd->se_cmd.transport_state, cmd->se_cmd.t_task_cdb[0]);
+		cancel_work_sync(&cmd->work);
+	}
+
 	qlt_free_cmd(cmd);
 }
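
For what it's worth, the cmpxchg() above atomically tests-and-clears the flag, so only one context can observe the 1 -> 0 transition and perform the cancel_work_sync(). (That is also why the qla_target.h hunk widens cmd_in_wq from a one-bit bitfield to a full unsigned int: you cannot take the address of a bitfield for cmpxchg().) A minimal userspace analogue of that claim-the-flag pattern, purely illustrative and using GCC atomics rather than the kernel's cmpxchg():

#include <stdio.h>

static int cmd_in_wq = 1;	/* imagine this was set when the work was queued */

int main(void)
{
	int expected = 1;

	/*
	 * Atomically: if cmd_in_wq is still 1, set it to 0 and report success.
	 * Whichever context wins this transition is the one that gets to do
	 * the cleanup (the cancel_work_sync() in the patch above).
	 */
	if (__atomic_compare_exchange_n(&cmd_in_wq, &expected, 0, 0,
					__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))
		printf("won the race: would cancel_work_sync() here\n");
	else
		printf("lost the race: the flag was already cleared\n");

	return 0;
}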

This patch applies cleanly to v4.9.34; however, it did not compile, as data_work, data_work_free and trc_flags are not present there yet. After removing those references it compiled fine.

Here's the relevant part of dmesg:

[  103.740391] md: data-check of RAID array md0
[  103.740395] md: minimum _guaranteed_  speed: 100000 KB/sec/disk.
[  103.740397] md: using maximum available idle IO bandwidth (but not more than 1000000 KB/sec) for data-check.
[  103.740416] md: using 128k window, over a total of 405798912k.
[  435.673247] ABORT_TASK: Found referenced qla2xxx task_tag: 1222180
[  435.927330] Canceling outstanding qla_tgt_cmd->cmd_in_wq = 1 state: 2 se_cmd->se_cmd_flags: 0x01c00109 se_cmd->transport_state: 0x00000091se_cmd->t_task_cdb: 0x2a
[  435.927351] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 1222180
[  565.785280] qla2xxx/21:00:00:24:ff:08:c4:8a: Unsupported SCSI Opcode 0x85, sending CHECK_CONDITION.
[  702.818402] ABORT_TASK: Found referenced qla2xxx task_tag: 1230964
[  703.090441] Canceling outstanding qla_tgt_cmd->cmd_in_wq = 1 state: 2 se_cmd->se_cmd_flags: 0x01c00109 se_cmd->transport_state: 0x00000091se_cmd->t_task_cdb: 0x2a
[  703.090457] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 1230964
[  888.912452] ABORT_TASK: Found referenced qla2xxx task_tag: 1215316
[  889.133709] Canceling outstanding qla_tgt_cmd->cmd_in_wq = 1 state: 2 se_cmd->se_cmd_flags: 0x01c00109 se_cmd->transport_state: 0x00000091se_cmd->t_task_cdb: 0x2a
[  889.133713] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 1215316
[ 1741.356248] ABORT_TASK: Found referenced qla2xxx task_tag: 1222900
[ 1741.609044] Canceling outstanding qla_tgt_cmd->cmd_in_wq = 1 state: 2 se_cmd->se_cmd_flags: 0x01c00109 se_cmd->transport_state: 0x00000091se_cmd->t_task_cdb: 0x2a
[ 1741.609047] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 1222900
[ 2365.885579] qla2xxx/21:00:00:24:ff:01:fa:9d: Unsupported SCSI Opcode 0x85, sending CHECK_CONDITION.
[ 2425.729887] ABORT_TASK: Found referenced qla2xxx task_tag: 1242772
[ 2425.901292] Canceling outstanding qla_tgt_cmd->cmd_in_wq = 1 state: 2 se_cmd->se_cmd_flags: 0x01c00109 se_cmd->transport_state: 0x00000091se_cmd->t_task_cdb: 0x2a
[ 2425.901296] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 1242772
[ 2778.797321] md: md0: data-check done.
[ 4165.989290] qla2xxx/21:00:00:24:ff:01:fa:9d: Unsupported SCSI Opcode 0x85, sending CHECK_CONDITION.

Presumably every "Canceling..." message would otherwise have been a lockup?

So I'd like Pascal to get this output before making further changes, to
confirm what's really going on.

Beyond this patch, I'm starting to think the two cases above where
tcm_qla2xxx calls tcm_qla2xxx_free_cmd() during TMR ABORT are in fact
incorrect.

They are incorrect because in both cases the target_core_tmr.c callers
and transport_cmd_finish_abort() are responsible for dropping any
remaining se_cmd->cmd_kref references and releasing all se_cmd resources.
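
In other words, under the kref model all of the cleanup hangs off the final put and runs exactly once, from whichever context drops the last reference. A rough userspace analogue of that ownership rule, illustrative only and not the target-core code:

#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

struct obj {
	atomic_int refcount;		/* stands in for se_cmd->cmd_kref */
};

static void release(struct obj *o)
{
	/* All teardown belongs here, and it must run exactly once. */
	printf("releasing %p\n", (void *)o);
	free(o);
}

static void obj_put(struct obj *o)
{
	/* The last put triggers release, like kref_put(&kref, release). */
	if (atomic_fetch_sub(&o->refcount, 1) == 1)
		release(o);
}

int main(void)
{
	struct obj *o = malloc(sizeof(*o));

	atomic_init(&o->refcount, 2);	/* e.g. I/O path + TMR ABORT path */
	obj_put(o);			/* one path drops its reference  */
	obj_put(o);			/* final put: release() runs now */
	return 0;
}

An extra fabric-internal free of the same command on top of this is either redundant or, as in the trace above, races with the final put.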

Having internal tcm_qla2xxx calls to tcm_qla2xxx_free_cmd() while the
TMR CMD_T_ABORTED code has already quiesced the se_cmd, and is in the
process of dropping se_cmd->cmd_kref references, doesn't make sense to me.

That said, let's have Pascal verify with the two above patches, and if
we confirm it's the scenario in question, then I'd like to consider dropping
the two cases in tcm_qla2xxx that invoke tcm_qla2xxx_free_cmd() during
CMD_T_ABORTED.

Please let me know if you need me to try anything else.

Regards,
Pascal de Bruijn

