Hi Bryant,

(Re-adding the target-devel CC)

On Mon, 2017-07-31 at 16:06 -0500, Bryant G. Ly wrote:
> > From: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
> > 
> > Hi folks,
> > 
> > This series contains target-core queue-full + iscsi-target
> > callback fixes for a bug reported by Steve and Potnuri
> > during recent v4.8.y iWARP/iser-target testing.
> > 
> > The first patch fixes target-core queue-full handling response
> > leaks with non -EAGAIN / -ENOMEM errors.  It uses a new state,
> > TRANSPORT_COMPLETE_QF_ERR, to internally generate CHECK_CONDITION
> > for unknown fabric callback errors, to avoid attempting retry
> > of fabric data-transfer callbacks for this special case.
> > 
> > This means all non -EAGAIN / -ENOMEM fabric callback errors
> > during target_core_fabric_ops for:
> > 
> >    *) ->write_pending()
> >    *) ->queue_data_in()
> >    *) ->queue_status()
> > 
> > will result in CHECK_CONDITION + LOGICAL_UNIT_COMMUNICATION_FAILURE,
> > if no non-zero se_cmd->scsi_status was previously set.
> > It also means target-core ->queue_status() errors retry indefinitely,
> > or until session shutdown explicitly stops outstanding I/O.
> > 
> > The remaining changes propagate iscsit_transport response
> > failures back to target-core queue-full, and update iser-target
> > to propagate isert_rdma_rw_ctx_post() errors from the RDMA R/W
> > API back to target-core as well.
> > 
> > Please review.
> > 
> > --nab
> > 
> > Nicholas Bellinger (3):
> >   target: Fix unknown fabric callback queue-full errors
> >   iscsi-target: Propigate queue_data_in + queue_status errors
> >   iser-target: Fix queue-full response handling
> > 
> >  drivers/infiniband/ulp/isert/ib_isert.c      |  53 +++++++++-----
> >  drivers/target/iscsi/iscsi_target.c          |   3 +-
> >  drivers/target/iscsi/iscsi_target_configfs.c |  13 ++--
> >  drivers/target/iscsi/iscsi_target_util.c     |   5 +-
> >  drivers/target/iscsi/iscsi_target_util.h     |   2 +-
> >  drivers/target/target_core_transport.c       | 102 ++++++++++++++++++---------
> >  include/target/target_core_base.h            |   1 +
> >  7 files changed, 114 insertions(+), 65 deletions(-)
> 
> Hi Nick,
> 
> I was sent this trace, and to me it looks like our driver isn't getting
> a response back from target-core when we return -EIO from
> ibmvscsis_write_pending() after a failed srp_transfer_data().  This then
> locks us up on a wait_for_completion, since our
> active_queue/scheduled_queues aren't empty, so we keep waiting.  Do you
> think there is still a leak for outgoing responses when our fabric
> driver does not return -EAGAIN or -ENOMEM?
> 
> Jul 20 16:53:34 tuk6r1phn1 kernel: [494628.123071] dev[c0000001fe6be000]: Backstore name '830530a9-2d65-317c-bea0-9ebc82739b64' is too long for INQUIRY_MODEL, truncating to 16 bytes
> Jul 20 16:53:34 tuk6r1phn1 kernel: [494628.592683] ibmvscsis 30000006: rdma: h_copy_rdma failed, rc -11
> Jul 20 16:53:34 tuk6r1phn1 kernel: [494628.592699] ibmvscsis: srp_transfer_data() failed: -11
> Jul 20 16:53:34 tuk6r1phn1 kernel: [494628.592705] Got unknown fabric queue status: -5
> Jul 20 16:53:34 tuk6r1phn1 kernel: [494628.592713] ibmvscsis 30000006: build_response: error copying to client, rc -11, flags 0x140, state 0x10
> Jul 20 16:53:34 tuk6r1phn1 kernel: [494628.592724] ibmvscsis: write_pending failed since: 320
> Jul 20 16:53:34 tuk6r1phn1 kernel: [494628.592731] ibmvscsis: write_pending failed since: 320
> Jul 20 16:53:34 tuk6r1phn1 kernel: [494628.592738] ibmvscsis: write_pending failed since: 320
> Jul 20 16:53:34 tuk6r1phn1 kernel: [494628.592745] ibmvscsis: write_pending failed since: 320
> Jul 20 16:53:35 tuk6r1phn1 kernel: [494628.620264] ibmvscsis: connection lost with outstanding work
> Jul 20 16:53:35 tuk6r1phn1 kernel: [494628.666964] HVCS: Closed vty-server@30000008 and partner vty@30000000:2 connection.
> 
> <SNIP>
> 
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428287] INFO: task kworker/1:4:6743 blocked for more than 120 seconds.
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428296] Tainted: G OE 4.8.0-58-generic #63~16.04.1-Ubuntu
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428298] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428301] kworker/1:4 D 0000000000000000 0 6743 2 0x00000800
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428310] Workqueue: ibmvscsis30000006 ibmvscsis_disconnect [ibmvscsis]
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428310] Call Trace:
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428313] [c0000001ae0c36f0] [c0000000014ef090] sysctl_sched_migration_cost+0x0/0x4 (unreliable)
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428316] [c0000001ae0c38c0] [c00000000001a610] __switch_to+0x280/0x3b0
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428319] [c0000001ae0c3920] [c000000000b4b6e4] __schedule+0x2f4/0x990
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428321] [c0000001ae0c3a00] [c000000000b4bdcc] schedule+0x4c/0xc0
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428322] [c0000001ae0c3a30] [c000000000b50404] schedule_timeout+0x284/0x480
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428324] [c0000001ae0c3b20] [c000000000b4cd9c] wait_for_common+0xec/0x240
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428326] [c0000001ae0c3ba0] [d000000008f66540] ibmvscsis_disconnect+0x240/0x9d0 [ibmvscsis]
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428329] [c0000001ae0c3c50] [c0000000000f1c28] process_one_work+0x1e8/0x5b0
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428330] [c0000001ae0c3ce0] [c0000000000f2098] worker_thread+0xa8/0x650
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428332] [c0000001ae0c3d80] [c0000000000fa864] kthread+0x114/0x140
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428334] [c0000001ae0c3e30] [c0000000000098f0] ret_from_kernel_thread+0x5c/0x6c
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428338] INFO: task targetcli:9627 blocked for more than 120 seconds.
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428341] Tainted: G OE 4.8.0-58-generic #63~16.04.1-Ubuntu
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428343] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428345] targetcli D 00003fff9bf3fd24 0 9627 9626 0x00040000
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428347] Call Trace:
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428349] [c0000001ad647700] [c0000000014ef090] sysctl_sched_migration_cost+0x0/0x4 (unreliable)
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428351] [c0000001ad6478d0] [c00000000001a610] __switch_to+0x280/0x3b0
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428352] [c0000001ad647930] [c000000000b4b6e4] __schedule+0x2f4/0x990
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428354] [c0000001ad647a10] [c000000000b4bdcc] schedule+0x4c/0xc0
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428355] [c0000001ad647a40] [c000000000b50404] schedule_timeout+0x284/0x480
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428357] [c0000001ad647b30] [c000000000b4cd9c] wait_for_common+0xec/0x240
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428366] [c0000001ad647bb0] [d0000000079a4d60] transport_clear_lun_ref+0x50/0x70 [target_core_mod]
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428373] [c0000001ad647be0] [d00000000799d0e8] core_tpg_remove_lun+0x58/0x160 [target_core_mod]
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428379] [c0000001ad647c20] [d0000000079895fc] core_dev_del_lun+0x4c/0x150 [target_core_mod]
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428385] [c0000001ad647cb0] [d00000000798b590] target_fabric_port_unlink+0x70/0x90 [target_core_mod]
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428388] [c0000001ad647ce0] [d000000004fd5bb8] configfs_unlink+0x188/0x2c0 [configfs]
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428390] [c0000001ad647d40] [c000000000338774] vfs_unlink+0x104/0x2a0
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428392] [c0000001ad647d90] [c00000000033fe08] do_unlinkat+0x378/0x3b0
> Jul 20 16:56:32 tuk6r1phn1 kernel: [494806.428393] [c0000001ad647e30] [c000000000009584] system_call+0x38/0xec
> Jul 20 16:58:33 tuk6r1phn1 kernel: [494927.259770] INFO: task kworker/1:4:6743 blocked for more than 120 seconds.

On v4.10.y and earlier code without commit fa7e25cf13, returning anything
other than -EAGAIN or -ENOMEM from fabric code via TFO->write_pending()
will result in a lost response and subsequent hung tasks.  In the
ibmvscsis case, I see -EAGAIN was changed to -EIO in commit 0edcc47cdb7
to avoid the endless loop on v4.9.y code when srp_transfer_data() would
fail repeatedly.

That said, with commit fa7e25cf13 in place, anything other than -EAGAIN
or -ENOMEM from TFO->write_pending() will reach the following code in
transport_handle_queue_full():

	/*
	 * -EAGAIN or -ENOMEM signals retry of ->write_pending() and/or
	 * ->queue_data_in() callbacks from new process context.
	 *
	 * Otherwise for other errors, transport_complete_qf() will send
	 * CHECK_CONDITION via ->queue_status() instead of attempting to
	 * retry associated fabric driver data-transfer callbacks.
	 */
	if (err == -EAGAIN || err == -ENOMEM) {
		cmd->t_state = (write_pending) ? TRANSPORT_COMPLETE_QF_WP :
						 TRANSPORT_COMPLETE_QF_OK;
	} else {
		pr_warn_ratelimited("Got unknown fabric queue status: %d\n",
			err);
		cmd->t_state = TRANSPORT_COMPLETE_QF_ERR;
	}
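For reference, here's a minimal userspace model of that state selection.
The enum and helper names below are hypothetical stand-ins for
TRANSPORT_COMPLETE_QF_* and transport_handle_queue_full(), not kernel
code, but the -EAGAIN / -ENOMEM versus everything-else split mirrors
the branch above:

	#include <assert.h>
	#include <errno.h>
	#include <stdbool.h>

	enum qf_state {
		QF_WP,	/* retry ->write_pending() from process context */
		QF_OK,	/* retry ->queue_data_in() / ->queue_status() */
		QF_ERR,	/* no retry: complete with CHECK_CONDITION */
	};

	static enum qf_state queue_full_state(int err, bool write_pending)
	{
		/* Only -EAGAIN / -ENOMEM signal a retriable queue-full. */
		if (err == -EAGAIN || err == -ENOMEM)
			return write_pending ? QF_WP : QF_OK;
		/* Any other error is fatal for data-transfer callbacks. */
		return QF_ERR;
	}

	int main(void)
	{
		/* -EIO from ->write_pending() is never retried... */
		assert(queue_full_state(-EIO, true) == QF_ERR);
		/* ...while -EAGAIN / -ENOMEM schedule a retry. */
		assert(queue_full_state(-EAGAIN, true) == QF_WP);
		assert(queue_full_state(-ENOMEM, false) == QF_OK);
		return 0;
	}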
Hitting the TRANSPORT_COMPLETE_QF_ERR path then invokes the following in
transport_complete_qf(), which immediately completes the se_cmd with any
fabric-driver-provided ->scsi_status, or returns CHECK_CONDITION with
LOGICAL UNIT COMMUNICATION FAILURE sense data if no ->scsi_status is
set:

	transport_complete_task_attr(cmd);

	/*
	 * If a fabric driver ->write_pending() or ->queue_data_in() callback
	 * has returned neither -ENOMEM or -EAGAIN, assume it's fatal and
	 * the same callbacks should not be retried.  Return CHECK_CONDITION
	 * if a scsi_status is not already set.
	 *
	 * If a fabric driver ->queue_status() has returned non zero, always
	 * keep retrying no matter what..
	 */
	if (cmd->t_state == TRANSPORT_COMPLETE_QF_ERR) {
		if (cmd->scsi_status)
			goto queue_status;

		cmd->se_cmd_flags |= SCF_EMULATED_TASK_SENSE;
		cmd->scsi_status = SAM_STAT_CHECK_CONDITION;
		cmd->scsi_sense_length = TRANSPORT_SENSE_BUFFER;
		translate_sense_reason(cmd,
				TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE);
		goto queue_status;
	}

So for ibmvscsis on v4.8.y code, I'd backport commit fa7e25cf13 into
your v4.8.y tree, and if/when a specific srp_transfer_data() failure is
retriable, return -EAGAIN.  Or, if the srp_transfer_data() failure is
not retriable and the se_cmd should be completed immediately with
exception status, return -EIO.

Btw, I'd been thinking about including commit fa7e25cf13 in the stable
backports for the iser-target consumers that need it.  If it addresses
this ibmvscsis case as well on <= v4.10.y code, that's even more of a
reason to add it to stable.
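To make that mapping concrete, here's a hedged sketch in plain C of what
a ->write_pending() return-value policy could look like once fa7e25cf13
is applied.  transfer_failure_is_retriable() and write_pending_errno()
are hypothetical names, not real ibmvscsis code; the actual driver would
have to classify real h_copy_rdma / srp_transfer_data() return codes:

	#include <errno.h>
	#include <stdbool.h>

	/*
	 * Hypothetical policy hook: decide whether a retry of the data
	 * transfer from target-core's queue-full path can ever succeed.
	 */
	static bool transfer_failure_is_retriable(int rc)
	{
		/* Placeholder: retry nothing (matches 0edcc47cdb7). */
		(void)rc;
		return false;
	}

	/*
	 * What ->write_pending() hands back to target-core, given the
	 * result of the data transfer from the client.
	 */
	static int write_pending_errno(int transfer_rc)
	{
		if (transfer_rc == 0)
			return 0;		/* transfer succeeded */
		if (transfer_failure_is_retriable(transfer_rc))
			return -EAGAIN;		/* retry ->write_pending() */
		return -EIO;			/* complete with CHECK_CONDITION */
	}

With fa7e25cf13 in place, the -EIO leg lands in the
TRANSPORT_COMPLETE_QF_ERR path quoted above instead of leaking the
response.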