On 21-07-17 15:07, Pascal de Bruijn wrote:
On 11-07-17 19:52, Nicholas A. Bellinger wrote:
Hi Pascal,
On Mon, 2017-07-10 at 20:52 -0700, Nicholas A. Bellinger wrote:
On Mon, 2017-07-10 at 20:41 -0700, Nicholas A. Bellinger wrote:
On Mon, 2017-07-10 at 17:04 +0200, Pascal de Bruijn wrote:
On 07-07-17 06:26, Nicholas A. Bellinger wrote:
On Mon, 2017-07-03 at 16:03 +0200, Pascal de Bruijn wrote:
So ABORT_TASK can still be observed, but it no longer results in a
non-functional, not-quite-panicked machine.
Thanks a lot for the bug report and your continuous testing to get this
resolved. The patch is queued up in target-pending/for-next with your
Tested-by, and will be CC'd to stable so the older v4.x.y and v3.x.y
kernels get this fix as well.
Thanks again.
I'm afraid we may not be quite there yet after all...
So we had the other two machines run an md check this weekend
as well, again with a ridiculously high sync_speed_max:
Jul 9 04:00:01 myhost kernel: [661309.794774] md: data-check of RAID array md0
Jul 9 04:00:01 myhost kernel: [661309.799173] md: minimum _guaranteed_ speed: 10000 KB/sec/disk.
Jul 9 04:00:01 myhost kernel: [661309.805219] md: using maximum available idle IO bandwidth (but not more than 1000000 KB/sec) for data-check.
Jul 9 04:00:01 myhost kernel: [661309.815194] md: using 128k window, over a total of 3252682752k.
Jul 9 04:00:42 myhost kernel: [661351.076391] qla2xxx/21:00:00:24:ff:4b:8f:58: Unsupported SCSI Opcode 0x85, sending CHECK_CONDITION.
Jul 9 04:02:01 myhost kernel: [661429.985082] qla2xxx/21:00:00:24:ff:4b:9e:19: Unsupported SCSI Opcode 0x85, sending CHECK_CONDITION.
Jul 9 04:04:24 myhost kernel: [661573.395245] qla2xxx/50:01:43:80:28:ca:86:36: Unsupported SCSI Opcode 0x85, sending CHECK_CONDITION.
Jul 9 04:04:57 myhost kernel: [661605.837694] qla2xxx/50:01:43:80:28:ca:86:e6: Unsupported SCSI Opcode 0x85, sending CHECK_CONDITION.
Jul 9 04:09:19 myhost kernel: [661868.261211] qla2xxx/21:00:00:24:ff:54:9e:ab: Unsupported SCSI Opcode 0x85, sending CHECK_CONDITION.
Jul 9 04:13:17 myhost kernel: [662105.788459] ABORT_TASK: Found referenced qla2xxx task_tag: 1175332
Jul 9 04:13:17 myhost kernel: [662105.794794] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1175332
Jul 9 04:13:17 myhost kernel: [662105.990584] ABORT_TASK: Found referenced qla2xxx task_tag: 1175380
Jul 9 04:13:18 myhost kernel: [662106.510403] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 1175380
Jul 9 04:13:20 myhost kernel: [662108.988526] ABORT_TASK: Found referenced qla2xxx task_tag: 1175620
Jul 9 04:13:31 myhost kernel: [662119.684969] ABORT_TASK: Found referenced qla2xxx task_tag: 1211140
This means the outstanding I/O was located, but never completed back.
Alternatively, something else, such as an outstanding qla2xxx WRITE
transfer, is blocked waiting for completion before the abort may proceed.
Jul 9 04:16:42 myhost kernel: [662310.617910] qla2xxx/21:00:00:24:ff:92:bf:43: Unsupported SCSI Opcode 0x85, sending CHECK_CONDITION.
Jul 9 04:18:00 myhost kernel: [662389.415853] qla2xxx/21:00:00:24:ff:54:a1:33: Unsupported SCSI Opcode 0x85, sending CHECK_CONDITION.
Jul 9 04:18:22 myhost kernel: [662411.066461] qla2xxx/21:00:00:24:ff:92:bf:59: Unsupported SCSI Opcode 0x85, sending CHECK_CONDITION.
Jul 9 04:20:23 myhost kernel: [662531.852833] qla2xxx/21:00:00:24:ff:3c:d0:94: Unsupported SCSI Opcode 0x85, sending CHECK_CONDITION.
Jul 9 07:00:28 myhost kernel: [672137.325166] md: md0: data-check done.
Btw, were there any hung task warnings while this was happening..?
The machine in question was still responsive (was accepting
SSH logins), however it seemed VMware hosts weren't seeing
the volume anymore (presumably due to the heavy IO on the backend).
Also, several (12?) kworkers seemed stuck in a D state.
Next time that happens, do a 'cat /proc/$PID/stack' to see where those
kworkers are blocked in uninterruptible sleep.
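In case it helps next time, here is a small sketch that automates that
check: it scans /proc for every task in uninterruptible sleep and dumps
each one's kernel stack (the stack file is only readable as root; the
parsing trick around ')' is just to cope with comm names containing
spaces):

```shell
# List all tasks in state "D" (uninterruptible sleep) with their
# kernel stacks, so the blocked kworkers can be captured in one go.
for stat in /proc/[0-9]*/stat; do
    pid=${stat#/proc/}; pid=${pid%/stat}
    # Field 3 of /proc/$PID/stat is the state; strip everything up to
    # the final ')' first, since the comm field may contain spaces.
    state=$(sed 's/^.*) //; s/ .*//' "$stat" 2>/dev/null)
    if [ "$state" = "D" ]; then
        printf '=== %s (%s)\n' "$pid" "$(cat "/proc/$pid/comm" 2>/dev/null)"
        cat "/proc/$pid/stack" 2>/dev/null
    fi
done
```

Run periodically (or from a watchdog script) while the md check is in
flight, this should show exactly which completion the stuck kworkers are
waiting on.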
When my colleague tried to reboot the machine, it got (presumably)
stuck on:
/usr/bin/targetctl clear
After which it was forcefully rebooted :)
Sorry we don't have any more detailed info at this point.
We haven't been able to reproduce this on a
different machine yet :(
Ok, please try with the following debug patch to verify if it's blocked
on qla_tgt_cmd->write_pending_abort_comp.
diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
index 8c1bf9b..1199969 100644
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
@@ -425,6 +425,7 @@ static int tcm_qla2xxx_write_pending_status(struct se_cmd *se_cmd)
 	if (se_cmd->t_state == TRANSPORT_WRITE_PENDING ||
 	    se_cmd->t_state == TRANSPORT_COMPLETE_QF_WP) {
 		spin_unlock_irqrestore(&se_cmd->t_state_lock, flags);
+		printk("Completing write_pending_abort_comp for tag: %llu\n", se_cmd->tag);
 		wait_for_completion(&cmd->write_pending_abort_comp);
 		return 0;
 	}
@@ -515,7 +516,11 @@ static void tcm_qla2xxx_handle_data_work(struct work_struct *work)
 	 * waiting upon completion in tcm_qla2xxx_write_pending_status()
 	 */
 	if (cmd->se_cmd.transport_state & CMD_T_ABORTED) {
+		u64 tag = cmd->se_cmd.tag;
+
+		printk("Blocked on write_pending_abort_comp for tag: %llu\n", tag);
 		complete(&cmd->write_pending_abort_comp);
+		printk("Awoke write_pending_abort_comp for tag: %llu\n", tag);
 		return;
 	}
Whoops, mixed up the printk() output above..
Please use this one instead:
diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
index 8c1bf9b..01ffb8a 100644
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
@@ -417,6 +417,8 @@ static int tcm_qla2xxx_write_pending_status(struct se_cmd *se_cmd)
 {
 	struct qla_tgt_cmd *cmd = container_of(se_cmd, struct qla_tgt_cmd, se_cmd);
 	unsigned long flags;
+	u64 tag = se_cmd->tag;
+
 	/*
 	 * Check for WRITE_PENDING status to determine if we need to wait for
 	 * CTIO aborts to be posted via hardware in tcm_qla2xxx_handle_data().
@@ -425,7 +427,9 @@ static int tcm_qla2xxx_write_pending_status(struct se_cmd *se_cmd)
 	if (se_cmd->t_state == TRANSPORT_WRITE_PENDING ||
 	    se_cmd->t_state == TRANSPORT_COMPLETE_QF_WP) {
 		spin_unlock_irqrestore(&se_cmd->t_state_lock, flags);
+		printk("Blocked on write_pending_abort_comp for tag: %llu\n", tag);
 		wait_for_completion(&cmd->write_pending_abort_comp);
+		printk("Awoke write_pending_abort_comp for tag: %llu\n", tag);
 		return 0;
 	}
 	spin_unlock_irqrestore(&se_cmd->t_state_lock, flags);
@@ -508,6 +512,11 @@ static void tcm_qla2xxx_handle_data_work(struct work_struct *work)
 	 */
 	cmd->cmd_in_wq = 0;
+	if (cmd->se_cmd.transport_state & CMD_T_ABORTED) {
+		printk("Detected CMD_T_ABORTED from handle_data_work tag: %llu %u\n",
+		       cmd->se_cmd.tag, cmd->write_data_transferred);
+	}
+
 	cmd->vha->tgt_counters.qla_core_ret_ctio++;
 	if (!cmd->write_data_transferred) {
 		/*
@@ -515,6 +524,8 @@ static void tcm_qla2xxx_handle_data_work(struct work_struct *work)
 	 * waiting upon completion in tcm_qla2xxx_write_pending_status()
 	 */
 	if (cmd->se_cmd.transport_state & CMD_T_ABORTED) {
+		printk("Completing write_pending_abort_comp for tag: %llu\n",
+		       cmd->se_cmd.tag);
 		complete(&cmd->write_pending_abort_comp);
 		return;
 	}
We still haven't been able to reliably reproduce this; however, we did
come across the following, which is very likely related.
Essentially we're running 'echo check > /sys/block/md0/md/sync_action'
throughout the night, with no practical sync_speed_max limit.
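For reference, a sketch of the commands driving that overnight run; the
1000000 KB/sec cap matches the "not more than 1000000 KB/sec" figure in
the log above, and the whole thing is guarded so it is a no-op on a
machine without a writable md0 array:

```shell
# Lift the md resync rate cap to an effectively unlimited value, then
# request a data-check pass; progress shows up in /proc/mdstat.
MD=/sys/block/md0/md
if [ -w "$MD/sync_action" ]; then
    echo 1000000 > "$MD/sync_speed_max"
    echo check   > "$MD/sync_action"
    cat /proc/mdstat
fi
```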
Only in the morning, presumably while I was doing stuff in VMware (I
don't particularly recall what), did VMware start reporting the volume
as inaccessible.
Here's the full dmesg, to provide maximum context (sorry for the long
post):
... SNIP ...
[Fri Jul 21 10:56:23 2017] INFO: task kworker/6:2:788 blocked for more than 180 seconds.
[Fri Jul 21 10:56:23 2017]       Not tainted 4.9.38-ul5 #3
[Fri Jul 21 10:56:23 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Jul 21 10:56:23 2017] kworker/6:2     D    0   788      2 0x00000000
[Fri Jul 21 10:56:23 2017] Workqueue: events qlt_free_session_done [qla2xxx]
[Fri Jul 21 10:56:23 2017]  0000000000000000 ffff9ebcd4c38000 ffff9ebcc9ce8000 ffff9ebcffc95080
[Fri Jul 21 10:56:23 2017]  ffff9ebcdc5c9780 ffffb11c24e67b28 ffffffffa8092196 ffffffffa7ca4a37
[Fri Jul 21 10:56:23 2017]  ffff9ebcc9ce8000 7fffffffffffffff 0000000000000000 ffff9ebcc9ce8000
[Fri Jul 21 10:56:23 2017] Call Trace:
[Fri Jul 21 10:56:23 2017]  [<ffffffffa8092196>] ? __schedule+0x166/0x530
[Fri Jul 21 10:56:23 2017]  [<ffffffffa7ca4a37>] ? __enqueue_entity+0x67/0x70
[Fri Jul 21 10:56:23 2017]  [<ffffffffa8092591>] schedule+0x31/0x80
[Fri Jul 21 10:56:23 2017]  [<ffffffffa8094ed7>] schedule_timeout+0x147/0x190
[Fri Jul 21 10:56:23 2017]  [<ffffffffa7ca3faf>] ? sched_clock_cpu+0x8f/0xa0
[Fri Jul 21 10:56:23 2017]  [<ffffffffa7c9f06f>] ? check_preempt_curr+0x4f/0x80
[Fri Jul 21 10:56:23 2017]  [<ffffffffa8092f10>] wait_for_common+0xb0/0x170
[Fri Jul 21 10:56:23 2017]  [<ffffffffa7c9fdb0>] ? wake_up_q+0x70/0x70
[Fri Jul 21 10:56:23 2017]  [<ffffffffa8092fe8>] wait_for_completion+0x18/0x20
[Fri Jul 21 10:56:23 2017]  [<ffffffffc02b3901>] target_wait_for_sess_cmds+0x41/0x110 [target_core_mod]
[Fri Jul 21 10:56:23 2017]  [<ffffffffc033afb0>] tcm_qla2xxx_free_session+0x40/0x80 [tcm_qla2xxx]
[Fri Jul 21 10:56:23 2017]  [<ffffffffc048d44d>] qlt_free_session_done+0x2cd/0x3d0 [qla2xxx]
[Fri Jul 21 10:56:23 2017]  [<ffffffffa7c92323>] ? insert_work+0x53/0xc0
[Fri Jul 21 10:56:23 2017]  [<ffffffffa7c93111>] process_one_work+0x121/0x3c0
[Fri Jul 21 10:56:23 2017]  [<ffffffffa7c933f3>] worker_thread+0x43/0x4d0
[Fri Jul 21 10:56:23 2017]  [<ffffffffa7c933b0>] ? process_one_work+0x3c0/0x3c0
[Fri Jul 21 10:56:23 2017]  [<ffffffffa7c933b0>] ? process_one_work+0x3c0/0x3c0
[Fri Jul 21 10:56:23 2017]  [<ffffffffa7c98be4>] kthread+0xd4/0xf0
[Fri Jul 21 10:56:23 2017]  [<ffffffffa7c98b10>] ? kthread_worker_fn+0x100/0x100
[Fri Jul 21 10:56:23 2017]  [<ffffffffa8096022>] ret_from_fork+0x22/0x30
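A trace like the one above can also be captured on demand for every
blocked task, not just the ones the hung-task detector happens to flag,
via SysRq "w"; a guarded sketch, assuming SysRq is enabled and we are
running as root:

```shell
# Ask the kernel to log the stacks of all tasks in uninterruptible
# sleep; the output lands in the kernel log, same format as the
# hung-task report. No-op when sysrq-trigger is not writable.
if [ -w /proc/sysrq-trigger ]; then
    echo w > /proc/sysrq-trigger
    dmesg | tail -n 100
fi
```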
I hope this helps, let me know if you need anything from our end.
I forgot to clarify, so this was with 4.9.38, with the following patches
applied:
1eb42f965cedafb700e9c902ddafb1c51e3117f7.patch
4db6a8145940d0bbd10265020d681961ce2d3238.patch
linux-4.9.38-target-filter-0x85-scsi-opcde-warning.patch (see below)
nab-debug.patch (the patch you asked me to apply)
Regards,
Pascal de Bruijn
diff -Nurpd a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
--- a/drivers/target/target_core_transport.c	2017-07-15 12:17:55.000000000 +0200
+++ b/drivers/target/target_core_transport.c	2017-07-20 09:51:21.494390926 +0200
@@ -1337,7 +1337,7 @@ target_setup_cmd_from_cdb(struct se_cmd
 	trace_target_sequencer_start(cmd);
 	ret = dev->transport->parse_cdb(cmd);
-	if (ret == TCM_UNSUPPORTED_SCSI_OPCODE)
+	if (ret == TCM_UNSUPPORTED_SCSI_OPCODE && cmd->t_task_cdb[0] != 0x85) /* filter VMware ATA PASS-THROUGH for SMART */
 		pr_warn_ratelimited("%s/%s: Unsupported SCSI Opcode 0x%02x, sending CHECK_CONDITION.\n",
 				    cmd->se_tfo->get_fabric_name(),
 				    cmd->se_sess->se_node_acl->initiatorname,