Re: [PATCH] target: iscsi: handle abort for WRITE_PENDING cmds

Mike Christie <michael.christie@xxxxxxxxxx> · Tue, 19 Jul 2022 21:31:12 -0500

On 7/19/22 11:14 AM, Dmitry Bogdanov wrote:
> Hi Mike,
> On Mon, Jul 18, 2022 at 04:22:36PM -0500, Mike Christie wrote:
>>
>> On 7/18/22 3:45 AM, Dmitry Bogdanov wrote:
>>> Hi Mike,
>>>
>>> On Thu, Jul 14, 2022 at 11:44:25AM -0500, Mike Christie wrote:
>>>>
>>>> On 7/13/22 3:42 PM, Dmitry Bogdanov wrote:
>>>>> Sometimes an initiator does not send data for WRITE commands and tries
>>>>> to abort it. The abort hangs waiting for frontend driver completion.
>>>>> iSCSI driver waits for for data and that timeout eventually initiates
>>>>> connection reinstatment. The connection closing releases the commands in
>>>>> the connection, but those aborted commands still did not handle the
>>>>> abort and did not decrease a command ref counter. Because of that the
>>>>> connection reinstatement hangs indefinitely and prevents re-login for
>>>>> that initiator.
>>>>>
>>>>> This patch adds a handling in TCM of the abort for the WRITE_PENDING
>>>>> commands at connection closing moment to make it possible to release
>>>>> them.
>>>>>
>>>>> Signed-off-by: Dmitry Bogdanov <d.bogdanov@xxxxxxxxx>
>>>>> ---
>>>>>  drivers/target/iscsi/iscsi_target.c | 13 ++++++++++---
>>>>>  1 file changed, 10 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
>>>>> index e368f038ff5c..27eca5e72f52 100644
>>>>> --- a/drivers/target/iscsi/iscsi_target.c
>>>>> +++ b/drivers/target/iscsi/iscsi_target.c
>>>>> @@ -26,6 +26,7 @@
>>>>>  #include <target/target_core_base.h>
>>>>>  #include <target/target_core_fabric.h>
>>>>>
>>>>> +#include <target/target_core_backend.h>
>>>>>  #include <target/iscsi/iscsi_target_core.h>
>>>>>  #include "iscsi_target_parameters.h"
>>>>>  #include "iscsi_target_seq_pdu_list.h"
>>>>> @@ -4171,7 +4172,8 @@ static void iscsit_release_commands_from_conn(struct iscsit_conn *conn)
>>>>>
>>>>>               if (se_cmd->se_tfo != NULL) {
>>>>>                       spin_lock_irq(&se_cmd->t_state_lock);
>>>>> -                     if (se_cmd->transport_state & CMD_T_ABORTED) {
>>>>> +                     if (se_cmd->t_state != TRANSPORT_WRITE_PENDING &&
>>>>> +                         se_cmd->transport_state & CMD_T_ABORTED) {
>>>>>                               /*
>>>>>                                * LIO's abort path owns the cleanup for this,
>>>>>                                * so put it back on the list and let
>>>>> @@ -4191,8 +4193,13 @@ static void iscsit_release_commands_from_conn(struct iscsit_conn *conn)
>>>>>               list_del_init(&cmd->i_conn_node);
>>>>>
>>>>>               iscsit_increment_maxcmdsn(cmd, sess);
>>>>> -             iscsit_free_cmd(cmd, true);
>>>>> -
>>>>> +             if (cmd->se_cmd.t_state == TRANSPORT_WRITE_PENDING &&
>>>>> +                 cmd->se_cmd.transport_state & CMD_T_ABORTED) {
>>>>> +                     /* handle an abort in TCM */
>>>>> +                     target_complete_cmd(&cmd->se_cmd, SAM_STAT_TASK_ABORTED);
>>>>>
>>>>
>>>> Will we have an extra ref left on the se_cmd if TAS is used so the se_cmd
>>>> does not get freed?
>>>>
>>>> For TAS, it looks like we would do:
>>>>
>>>> - target_handle_abort -> queue_status. This would not do anything because
>>>> before calling iscsit_release_commands_from_conn we have killed the iscsi tx
>>>> thread.
>>>>
>>>> - target_handle_abort -> transport_cmd_check_stop_to_fabric -> check_stop_free ->
>>>> target_put_sess_cmd.
>>>>
>>>> iscsi creates the se_cmd with TARGET_SCF_ACK_KREF set so do we have one ref
>>>> left?
>>> Yes, you are right. TAS case is not covered by my patch. But that is
>>> actually another bug (that iSCSI does not complete responses in case of
>>> connection closed).
>>
>> What do you mean this is a bug already? I mean is there a leak or spec violation?
>>
>> Spec wise we don't need to send a response to the initiator when the connection
>> is closed for a single connection session and ERL=0. We just can't because the
>> connection is down. And the initiator knows it will not be getting a response
>> because the connection is gone and cleans up on it's side.
> Looks like it is a FC term :) "Completion" there is a confirmation that
> a response has been received by a peer and a driver can free its
> resources now. A failed completion due to network error (logout for
> instance) is a completion too.
> Under "iSCSI does not complete responses in case of connection closed"
> I meant that iscsi_target does nothing if tcm_core queues a
> response/status when iscsi connection is closed already. It does not
> "complete" the queued response by decrementing kref by
> target_put_sess_cmd like in normal case.
> 
> I reproduced that bug by simple test case:
> 1. Export scsi_debug (with 30s delay) device to Initiator on 2 paths
> 2. On initiator:
>   # make the first IO do some initial IO traffic on the disk to make the
>   # second dd to send just one READ_10 command.
>   dd if=/dev/sda iflag=direct of=/dev/null bs=512 count=1
>   # start 1 IO on the first path that will hang forever eventually
>   dd if=/dev/sda iflag=direct of=/dev/null bs=512 count=1 &
>   sleep 1
>   # LUN_RESET on the second path, to make TAS feature send a response
>   # for the command from the first path
>   sg_reset -d /dev/sdb &
> 3. On target:
>   # simulate local connection reinstatement (like on DataOut timeout)
>   echo 0 > /sys/kernel/config/target/iscsi/iqn/tpg0/enable
> 
> In that scenario the connection will be already closed at the moment of
> target_handle_abort => queue_status(), iscsi_target will not free that
> cmd at the connection closing because that command is CMD_T_ABORTED and
> tcm_core will endlessly wait for "completion" of the queued response.

I get what you mean. I can replicate it with just one path and one dd.

> 
> That is that another bug that is not addressed in my patch because it is
> really another bug.
> My patch fixes only unableness of relogin (due to aborted WRITE_PENDING
> commands) that was really catched by our customers. I believe that it
> make sense to have it in 5.20.

What do you think it will take to fix the TAS part of it? I mean how long?

I think at least we should add a comment to the code and/or git commit so
if it's going to take a while other sustaining type of people seeing the
leak and hang will not waste a lot of time debugging it thinking it was
a mistake in the patch.

> 
> For RecoveryLevel > 0, I even get a crash on cmd->conn dereference at

Ah ok, even more fun.