Hi Bryant,
(Re-adding the target-devel CC)
On Mon, 2017-07-31 at 16:06 -0500, Bryant G. Ly wrote:
From: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
Hi folks,
This series contains target-core queue-full + iscsi-target
callback fixes for a bug reported by Steve and Potnuri
during recent v4.8.y iWARP/iser-target testing.
The first patch fixes response leaks in target-core queue-full handling
for non -EAGAIN / -ENOMEM errors. It uses a new state
TRANSPORT_COMPLETE_QF_ERR to internally generate CHECK_CONDITION
for unknown fabric callback errors, to avoid attempting retry
of fabric data-transfer callbacks for this special case.
This means any non -EAGAIN / -ENOMEM error returned from the
following target_core_fabric_ops callbacks:
*) ->write_pending()
*) ->queue_data_in()
*) ->queue_status()
will result in CHECK_CONDITION + LOGICAL_UNIT_COMMUNICATION_FAILURE,
if no non-zero se_cmd->scsi_status was previously set.
It also means target-core ->queue_status() errors retry indefinitely,
or until session shutdown explicitly stops outstanding I/O.
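To make the contract concrete, here's a minimal sketch of what this
looks like from the fabric driver side. This is illustrative only:
my_fabric_write_pending() and my_hw_post_transfer() are made-up names,
and which low-level errors count as transient is an assumption, not
taken from any in-tree driver:

/*
 * Hypothetical fabric ->write_pending() illustrating the return-code
 * contract described above.  Assumes normal target_core_fabric.h
 * context; my_hw_post_transfer() is a made-up helper.
 */
static int my_fabric_write_pending(struct se_cmd *se_cmd)
{
        int rc = my_hw_post_transfer(se_cmd);

        if (rc == -EBUSY)
                return -EAGAIN; /* transient: target-core re-invokes
                                 * ->write_pending() from process context */
        if (rc)
                return -EIO;    /* fatal: se_cmd is completed with
                                 * CHECK_CONDITION + LOGICAL UNIT
                                 * COMMUNICATION FAILURE, and the
                                 * data-transfer callbacks are not retried */

        return 0;               /* transfer posted: fabric calls
                                 * target_execute_cmd() once the WRITE
                                 * data has arrived */
}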
The remaining changes are for propagating iscsit_transport
response failure back to target-core queue-full, and updating
iser-target to propagate isert_rdma_rw_ctx_post() errors from the
RDMA R/W API back to target-core as well.
Please review.
--nab
Nicholas Bellinger (3):
target: Fix unknown fabric callback queue-full errors
iscsi-target: Propigate queue_data_in + queue_status errors
iser-target: Fix queue-full response handling
drivers/infiniband/ulp/isert/ib_isert.c | 53 +++++++++-----
drivers/target/iscsi/iscsi_target.c | 3 +-
drivers/target/iscsi/iscsi_target_configfs.c | 13 ++--
drivers/target/iscsi/iscsi_target_util.c | 5 +-
drivers/target/iscsi/iscsi_target_util.h | 2 +-
drivers/target/target_core_transport.c | 102 ++++++++++++++++++---------
include/target/target_core_base.h | 1 +
7 files changed, 114 insertions(+), 65 deletions(-)
Hi Nick,
I was sent this trace, and to me it looks like our driver isn't
getting a response back from target-core when we return -EIO from
ibmvscsis_write_pending() after a failed srp_transfer_data(). That
then locks us up in wait_for_completion() since our
active_queue/scheduled_queues aren't empty, so we keep waiting. Do
you think there is still a leak for outgoing responses when our
fabric driver does not return -EAGAIN or -ENOMEM?
Jul 20 16:53:34 tuk6r1phn1 kernel: [494628.123071]
dev[c0000001fe6be000]: Backstore name
'830530a9-2d65-317c-bea0-9ebc82739b64' is too long for
INQUIRY_MODEL, truncating to 16 bytes
Jul 20 16:53:34 tuk6r1phn1 kernel: [494628.592683] ibmvscsis
30000006: rdma: h_copy_rdma failed, rc -11
Jul 20 16:53:34 tuk6r1phn1 kernel: [494628.592699] ibmvscsis:
srp_transfer_data() failed: -11
Jul 20 16:53:34 tuk6r1phn1 kernel: [494628.592705] Got unknown
fabric queue status: -5
Jul 20 16:53:34 tuk6r1phn1 kernel: [494628.592713] ibmvscsis
30000006: build_response: error copying to client, rc -11, flags
0x140, state 0x10
Jul 20 16:53:34 tuk6r1phn1 kernel: [494628.592724] ibmvscsis:
write_pending failed since: 320
Jul 20 16:53:34 tuk6r1phn1 kernel: [494628.592731] ibmvscsis:
write_pending failed since: 320
Jul 20 16:53:34 tuk6r1phn1 kernel: [494628.592738] ibmvscsis:
write_pending failed since: 320
Jul 20 16:53:34 tuk6r1phn1 kernel: [494628.592745] ibmvscsis:
write_pending failed since: 320
Jul 20 16:53:35 tuk6r1phn1 kernel: [494628.620264] ibmvscsis:
connection lost with outstanding work
Jul 20 16:53:35 tuk6r1phn1 kernel: [494628.666964] HVCS: Closed
vty-server@30000008 and partner vty@30000000:2 connection.
<SNIP>
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428287] INFO: task
kworker/1:4:6743 blocked for more than 120 seconds.
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428296] Tainted:
G OE 4.8.0-58-generic #63~16.04.1-Ubuntu
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428298] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428301] kworker/1:4 D
0000000000000000 0 6743 2 0x00000800
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428310] Workqueue:
ibmvscsis30000006 ibmvscsis_disconnect [ibmvscsis]
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428310] Call Trace:
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428313]
[c0000001ae0c36f0] [c0000000014ef090]
sysctl_sched_migration_cost+0x0/0x4 (unreliable)
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428316]
[c0000001ae0c38c0] [c00000000001a610] __switch_to+0x280/0x3b0
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428319]
[c0000001ae0c3920] [c000000000b4b6e4] __schedule+0x2f4/0x990
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428321]
[c0000001ae0c3a00] [c000000000b4bdcc] schedule+0x4c/0xc0
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428322]
[c0000001ae0c3a30] [c000000000b50404] schedule_timeout+0x284/0x480
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428324]
[c0000001ae0c3b20] [c000000000b4cd9c] wait_for_common+0xec/0x240
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428326]
[c0000001ae0c3ba0] [d000000008f66540]
ibmvscsis_disconnect+0x240/0x9d0 [ibmvscsis]
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428329]
[c0000001ae0c3c50] [c0000000000f1c28] process_one_work+0x1e8/0x5b0
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428330]
[c0000001ae0c3ce0] [c0000000000f2098] worker_thread+0xa8/0x650
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428332]
[c0000001ae0c3d80] [c0000000000fa864] kthread+0x114/0x140
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428334]
[c0000001ae0c3e30] [c0000000000098f0] ret_from_kernel_thread+0x5c/0x6c
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428338] INFO: task
targetcli:9627 blocked for more than 120 seconds.
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428341] Tainted:
G OE 4.8.0-58-generic #63~16.04.1-Ubuntu
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428343] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428345] targetcli D
00003fff9bf3fd24 0 9627 9626 0x00040000
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428347] Call Trace:
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428349]
[c0000001ad647700] [c0000000014ef090]
sysctl_sched_migration_cost+0x0/0x4 (unreliable)
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428351]
[c0000001ad6478d0] [c00000000001a610] __switch_to+0x280/0x3b0
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428352]
[c0000001ad647930] [c000000000b4b6e4] __schedule+0x2f4/0x990
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428354]
[c0000001ad647a10] [c000000000b4bdcc] schedule+0x4c/0xc0
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428355]
[c0000001ad647a40] [c000000000b50404] schedule_timeout+0x284/0x480
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428357]
[c0000001ad647b30] [c000000000b4cd9c] wait_for_common+0xec/0x240
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428366]
[c0000001ad647bb0] [d0000000079a4d60]
transport_clear_lun_ref+0x50/0x70 [target_core_mod]
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428373]
[c0000001ad647be0] [d00000000799d0e8] core_tpg_remove_lun+0x58/0x160
[target_core_mod]
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428379]
[c0000001ad647c20] [d0000000079895fc] core_dev_del_lun+0x4c/0x150
[target_core_mod]
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428385]
[c0000001ad647cb0] [d00000000798b590]
target_fabric_port_unlink+0x70/0x90 [target_core_mod]
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428388]
[c0000001ad647ce0] [d000000004fd5bb8] configfs_unlink+0x188/0x2c0
[configfs]
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428390]
[c0000001ad647d40] [c000000000338774] vfs_unlink+0x104/0x2a0
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428392]
[c0000001ad647d90] [c00000000033fe08] do_unlinkat+0x378/0x3b0
Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428393]
[c0000001ad647e30] [c000000000009584] system_call+0x38/0xec
Jul 20 16:58:33 tuk6r1phn1 kernel: [494927.259770] INFO: task
kworker/1:4:6743 blocked for more than 120 seconds.
On v4.10.y and earlier code without commit fa7e25cf13, returning
anything other than -EAGAIN or -ENOMEM from fabric code via
TFO->write_pending() will result in a lost response and subsequent hung
tasks..
In ibmvscsi's case, I see -EAGAIN was changed to -EIO in commit
0edcc47cdb7 to avoid the endless loop on v4.9.y code, when
srp_transfer_data() would fail repeatedly.
That said, with commit fa7e25cf13 in place, anything other than -EAGAIN
or -ENOMEM from TFO->write_pending() will reach the following code in
transport_handle_queue_full():
        /*
         * -EAGAIN or -ENOMEM signals retry of ->write_pending() and/or
         * ->queue_data_in() callbacks from new process context.
         *
         * Otherwise for other errors, transport_complete_qf() will send
         * CHECK_CONDITION via ->queue_status() instead of attempting to
         * retry associated fabric driver data-transfer callbacks.
         */
        if (err == -EAGAIN || err == -ENOMEM) {
                cmd->t_state = (write_pending) ? TRANSPORT_COMPLETE_QF_WP :
                                                 TRANSPORT_COMPLETE_QF_OK;
        } else {
                pr_warn_ratelimited("Got unknown fabric queue status: %d\n",
                                    err);
                cmd->t_state = TRANSPORT_COMPLETE_QF_ERR;
        }
which sets TRANSPORT_COMPLETE_QF_ERR, and invokes the following in
transport_complete_qf() to immediately complete se_cmd with any fabric
driver provided ->scsi_status, or return CHECK_CONDITION with logical
unit communication failure sense data if no ->scsi_status is set:
        transport_complete_task_attr(cmd);
        /*
         * If a fabric driver ->write_pending() or ->queue_data_in() callback
         * has returned neither -ENOMEM or -EAGAIN, assume it's fatal and
         * the same callbacks should not be retried. Return CHECK_CONDITION
         * if a scsi_status is not already set.
         *
         * If a fabric driver ->queue_status() has returned non zero, always
         * keep retrying no matter what..
         */
        if (cmd->t_state == TRANSPORT_COMPLETE_QF_ERR) {
                if (cmd->scsi_status)
                        goto queue_status;
                cmd->se_cmd_flags |= SCF_EMULATED_TASK_SENSE;
                cmd->scsi_status = SAM_STAT_CHECK_CONDITION;
                cmd->scsi_sense_length = TRANSPORT_SENSE_BUFFER;
                translate_sense_reason(cmd,
                                       TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE);
                goto queue_status;
        }
So for ibmvscsi, I'd backport commit fa7e25cf13 to your v4.8.y tree,
and if/when a specific srp_transfer_data() failure is retriable,
return -EAGAIN. Or, if the srp_transfer_data() failure is not
retriable and se_cmd should be completed immediately with exception
status, return -EIO.
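For example (sketch only, not the actual ibmvscsis code; the two
helpers below are hypothetical, and which srp_transfer_data() /
h_copy_rdma failures are really transient is something you'd need to
verify):

/*
 * Sketch of the -EAGAIN vs. -EIO decision in ->write_pending() once
 * fa7e25cf13 is applied.  ibmvscsis_do_srp_transfer() and
 * srp_transfer_failure_is_retriable() are hypothetical wrappers, not
 * existing driver code.
 */
static int ibmvscsis_write_pending(struct se_cmd *se_cmd)
{
        int rc = ibmvscsis_do_srp_transfer(se_cmd);

        if (!rc)
                return 0;       /* success: continue as the driver
                                 * does today */

        if (srp_transfer_failure_is_retriable(rc))
                return -EAGAIN; /* target-core retries ->write_pending()
                                 * from process context */

        return -EIO;            /* target-core completes se_cmd immediately
                                 * with CHECK_CONDITION */
}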
Btw, I'd been thinking about including commit fa7e25cf13 for stable
backports for the iser-target consumers that need it. If it addresses
this ibmvscsi case as well on <= v4.10.y code, that's even more
reason to add it to stable.