Re: [PATCH 5/7] scsi: target: iscsit/isert: stop/wait on cmds during conn close

Mike Christie <michael.christie@xxxxxxxxxx> · Sat, 10 Dec 2022 12:48:47 -0600

On 12/9/22 6:32 AM, Dmitry Bogdanov wrote:
> On Wed, Dec 07, 2022 at 09:10:00PM -0600, Mike Christie wrote:
>>
>> This fixes 2 bugs added in:
>>
>> commit f36199355c64 ("scsi: target: iscsi: Fix cmd abort fabric stop
>> race")
>>
>> If we have multiple sessions to the same se_device we can hit a race where
>> a LUN_RESET on one session cleans up the se_cmds from under another
>> session which is being closed. This results in the closing session freeing
>> its conn/session structs while they are still in use.
>>
>> The bug is:
>>
>> 1. Session1 has IO se_cmd1.
>> 2. Session2 can also have se_cmds for IO and optionally TMRs for ABORTS
>> but then gets a LUN_RESET.
>> 3. The LUN_RESET on session2 sees the se_cmds on session1 and during
>> the drain stages marks them all with CMD_T_ABORTED.
>> 4. session1 is now closed so iscsit_release_commands_from_conn only sees
>> se_cmds with the CMD_T_ABORTED bit set and returns immediately even
>> though we have outstanding commands.
>> 5. session1's connection and session are freed.
>> 6. The backend request for se_cmd1 completes and it accesses the freed
>> connection/session.
>>
>> If session1 was executing only IO se_cmds and TAS is set on the se_cmd,
>> then we need to do a iscsit_free_cmd on those commands, so we wait on
>> their completion from LIO core and the backend.
>>
>> If session1 was waiting on tmr se_cmds or TAS is not set then we need to
>> wait for those outstanding se_cmds to have their last put done so we
>> know no user is still accessing them when we free the session/conn.
>>
>> This fixes the TAS set case, by adding a check so if we hit it we now call
>> iscsit_free_cmd. To handle the tmr se_cd and non TAS case, it hooks the
>> iscsit layer into the cmd counter code, so we can wait for all outstanding
>> commands before freeing the connection and possibly the session.
>>
>> Fixes: f36199355c64 ("scsi: target: iscsi: Fix cmd abort fabric stop race")
>> Signed-off-by: Mike Christie <michael.christie@xxxxxxxxxx>
>> ---
>>  drivers/infiniband/ulp/isert/ib_isert.c | 13 +------------
>>  drivers/target/iscsi/iscsi_target.c     | 13 ++++++++++++-
>>  drivers/target/target_core_transport.c  |  6 ++++--
>>  include/target/target_core_fabric.h     |  2 ++
>>  4 files changed, 19 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
>> index b360a1527cd1..600059d8a3a7 100644
>> --- a/drivers/infiniband/ulp/isert/ib_isert.c
>> +++ b/drivers/infiniband/ulp/isert/ib_isert.c
>> @@ -2501,17 +2501,6 @@ isert_wait4logout(struct isert_conn *isert_conn)
>>         }
>>  }
>>
>> -static void
>> -isert_wait4cmds(struct iscsit_conn *conn)
>> -{
>> -       isert_info("iscsit_conn %p\n", conn);
>> -
>> -       if (conn->sess) {
>> -               target_stop_session(conn->sess->se_sess);
>> -               target_wait_for_sess_cmds(conn->sess->se_sess);
>> -       }
>> -}
>> -
>>  /**
>>   * isert_put_unsol_pending_cmds() - Drop commands waiting for
>>   *     unsolicitate dataout
>> @@ -2559,7 +2548,7 @@ static void isert_wait_conn(struct iscsit_conn *conn)
>>
>>         ib_drain_qp(isert_conn->qp);
>>         isert_put_unsol_pending_cmds(conn);
>> -       isert_wait4cmds(conn);
>> +       target_wait_for_cmds(conn->cmd_cnt);
>>         isert_wait4logout(isert_conn);
>>
>>         queue_work(isert_release_wq, &isert_conn->release_work);
>> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
>> index 7a8ffdf33bee..1c3470e4b50c 100644
>> --- a/drivers/target/iscsi/iscsi_target.c
>> +++ b/drivers/target/iscsi/iscsi_target.c
>> @@ -4221,7 +4221,8 @@ static void iscsit_release_commands_from_conn(struct iscsit_conn *conn)
>>
>>                 if (se_cmd->se_tfo != NULL) {
>>                         spin_lock_irq(&se_cmd->t_state_lock);
>> -                       if (se_cmd->transport_state & CMD_T_ABORTED) {
>> +                       if (se_cmd->transport_state & CMD_T_ABORTED &&
>> +                           !(se_cmd->transport_state & CMD_T_TAS)) {
>>                                 /*
>>                                  * LIO's abort path owns the cleanup for this,
>>                                  * so put it back on the list and let
> 
> Could you please extract ths snippet (fix of the hanged commands with
> TAS) to a separate patch? It looks good.

Yeah.

> 
>> @@ -4244,6 +4245,14 @@ static void iscsit_release_commands_from_conn(struct iscsit_conn *conn)
>>                 iscsit_free_cmd(cmd, true);
>>
>>         }
>> +
>> +       /*
>> +        * Wait on commands that were cleaned up via the aborted_task path.
>> +        * LLDs that implement iscsit_wait_conn will already have waited for
>> +        * commands.
>> +        */
>> +       if (!conn->conn_transport->iscsit_wait_conn)
>> +               target_wait_for_cmds(conn->cmd_cnt);
>>  }
>>
>>  static void iscsit_stop_timers_for_cmds(
>> @@ -4304,6 +4313,8 @@ int iscsit_close_connection(
>>         iscsit_stop_nopin_response_timer(conn);
>>         iscsit_stop_nopin_timer(conn);
>>
>> +       target_stop_cmd_counter(conn->cmd_cnt);
>> +
>>         if (conn->conn_transport->iscsit_wait_conn)
>>                 conn->conn_transport->iscsit_wait_conn(conn);
> 
> I strongly believe that waiting for commands complete before decreasing
> the command refcounter is useless and leads to hangings.
> There was a several tries to wait for the commands complete in the
> session. But all of them were eventually reverted due to iSER [1].
> [1] https://lore.kernel.org/all/CH2PR12MB4005D671F3D274C4D5FA0BAEDD1C0@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> 

Yeah, I saw those. It's why we have the stop+wait split and why I left
the isert target_wait_session where it was.

> Let's try it one more time - move conn->conn_transport->iscsit_wait_conn(conn)
> to the end of iscsit_release_commands_from_conn() to align iser with other
> iscsi transports.
> 
> Probably, to have target_wait_for_cmds as a default .iscsit_wait_conn
> implementation would be the best way.
> 

I think we have to do that in another patchset because this was meant
to just fix the 2 regressions (that cleanup patch does not have conflicts
so no need to backport it). What you want to do is not going to be as
simple as just moving that call around so we can't sneak it in.

The iscsit_wait_conn is where it is because it does 2 things:

1. The issue with Bart's patches was isert doesn't submit commands to lio
from the iscsi rx thread so killing it or running iscsit_close_connection
from it doesn't prevent new commands from coming in like it does for iscsi.
For isert you have to disconnect the connection before cleaning up so we
don't get new commands while/after cleaning up.

2. Because isert submits from the ib context we have to do a flush like call
to make sure it's not still running similar to how for iscsi we kill the
rx thread then do the cleanup from the tx thread (or the reverse depending
on which thread starts the recovery). So the target_stop/wait_session did
that for the driver.

The current code works for iscsit because we send/recv from the tx/rx threads.
If we recvd from the sk_data_ready callback then we would have a similar
problem as isert.

So we have to do:

1. Add a new callout close_connection which for isert does this
initial disconnect cleanup and is called before iscsit_release_commands_from_conn.
Bart's patch added a close callout but called it was called too late.

2. The target_stop_session calls acts as flush right now. When it returns
we know the ib layer is not calling into the iscsi/lio layers anymore. We need
something to replace this.

3. We can then do target_wait_for_cmds after iscsit_release_commands_from_conn
for both drivers, but, we might need some more changes. See below.

When we do iscsit iscsit_release_commands_from_conn we are:

1. Waiting on commands in the backend and LIO core.
2. Doing the last put on commands that have had queue_status called but
we haven't freed the cmd because they haven't been ackd.

Are we hitting an issue with #2? We need a proper bug and analysis or we are
just guessing and am going to mess up.

For example, for isert is the bug you are worried about that we have a missing
isert_send_done/isert_completion_put call because we disconnected before the
send callbacks could be done or because the ib layer won't call isert_send_done
when it detects a failure? If so then yeah, the current code is going to hang
since nothing is going to do the last put on the session since that's done later.
To handle that we will have to move code around so we do isert_unmap_tx_desc from
iscsit_unmap_cmd or do something so we don't leak those resources.