Re: [PATCH 5/7] scsi: target: iscsit/isert: stop/wait on cmds during conn close

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Dec 07, 2022 at 09:10:00PM -0600, Mike Christie wrote:
> 
> This fixes 2 bugs added in:
> 
> commit f36199355c64 ("scsi: target: iscsi: Fix cmd abort fabric stop
> race")
> 
> If we have multiple sessions to the same se_device we can hit a race where
> a LUN_RESET on one session cleans up the se_cmds from under another
> session which is being closed. This results in the closing session freeing
> its conn/session structs while they are still in use.
> 
> The bug is:
> 
> 1. Session1 has IO se_cmd1.
> 2. Session2 can also have se_cmds for IO and optionally TMRs for ABORTS
> but then gets a LUN_RESET.
> 3. The LUN_RESET on session2 sees the se_cmds on session1 and during
> the drain stages marks them all with CMD_T_ABORTED.
> 4. session1 is now closed so iscsit_release_commands_from_conn only sees
> se_cmds with the CMD_T_ABORTED bit set and returns immediately even
> though we have outstanding commands.
> 5. session1's connection and session are freed.
> 6. The backend request for se_cmd1 completes and it accesses the freed
> connection/session.
> 
> If session1 was executing only IO se_cmds and TAS is set on the se_cmd,
> then we need to do a iscsit_free_cmd on those commands, so we wait on
> their completion from LIO core and the backend.
> 
> If session1 was waiting on tmr se_cmds or TAS is not set then we need to
> wait for those outstanding se_cmds to have their last put done so we
> know no user is still accessing them when we free the session/conn.
> 
> This fixes the TAS set case, by adding a check so if we hit it we now call
> iscsit_free_cmd. To handle the tmr se_cd and non TAS case, it hooks the
> iscsit layer into the cmd counter code, so we can wait for all outstanding
> commands before freeing the connection and possibly the session.
> 
> Fixes: f36199355c64 ("scsi: target: iscsi: Fix cmd abort fabric stop race")
> Signed-off-by: Mike Christie <michael.christie@xxxxxxxxxx>
> ---
>  drivers/infiniband/ulp/isert/ib_isert.c | 13 +------------
>  drivers/target/iscsi/iscsi_target.c     | 13 ++++++++++++-
>  drivers/target/target_core_transport.c  |  6 ++++--
>  include/target/target_core_fabric.h     |  2 ++
>  4 files changed, 19 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
> index b360a1527cd1..600059d8a3a7 100644
> --- a/drivers/infiniband/ulp/isert/ib_isert.c
> +++ b/drivers/infiniband/ulp/isert/ib_isert.c
> @@ -2501,17 +2501,6 @@ isert_wait4logout(struct isert_conn *isert_conn)
>         }
>  }
> 
> -static void
> -isert_wait4cmds(struct iscsit_conn *conn)
> -{
> -       isert_info("iscsit_conn %p\n", conn);
> -
> -       if (conn->sess) {
> -               target_stop_session(conn->sess->se_sess);
> -               target_wait_for_sess_cmds(conn->sess->se_sess);
> -       }
> -}
> -
>  /**
>   * isert_put_unsol_pending_cmds() - Drop commands waiting for
>   *     unsolicitate dataout
> @@ -2559,7 +2548,7 @@ static void isert_wait_conn(struct iscsit_conn *conn)
> 
>         ib_drain_qp(isert_conn->qp);
>         isert_put_unsol_pending_cmds(conn);
> -       isert_wait4cmds(conn);
> +       target_wait_for_cmds(conn->cmd_cnt);
>         isert_wait4logout(isert_conn);
> 
>         queue_work(isert_release_wq, &isert_conn->release_work);
> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
> index 7a8ffdf33bee..1c3470e4b50c 100644
> --- a/drivers/target/iscsi/iscsi_target.c
> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -4221,7 +4221,8 @@ static void iscsit_release_commands_from_conn(struct iscsit_conn *conn)
> 
>                 if (se_cmd->se_tfo != NULL) {
>                         spin_lock_irq(&se_cmd->t_state_lock);
> -                       if (se_cmd->transport_state & CMD_T_ABORTED) {
> +                       if (se_cmd->transport_state & CMD_T_ABORTED &&
> +                           !(se_cmd->transport_state & CMD_T_TAS)) {
>                                 /*
>                                  * LIO's abort path owns the cleanup for this,
>                                  * so put it back on the list and let

Could you please extract ths snippet (fix of the hanged commands with
TAS) to a separate patch? It looks good.

> @@ -4244,6 +4245,14 @@ static void iscsit_release_commands_from_conn(struct iscsit_conn *conn)
>                 iscsit_free_cmd(cmd, true);
> 
>         }
> +
> +       /*
> +        * Wait on commands that were cleaned up via the aborted_task path.
> +        * LLDs that implement iscsit_wait_conn will already have waited for
> +        * commands.
> +        */
> +       if (!conn->conn_transport->iscsit_wait_conn)
> +               target_wait_for_cmds(conn->cmd_cnt);
>  }
> 
>  static void iscsit_stop_timers_for_cmds(
> @@ -4304,6 +4313,8 @@ int iscsit_close_connection(
>         iscsit_stop_nopin_response_timer(conn);
>         iscsit_stop_nopin_timer(conn);
> 
> +       target_stop_cmd_counter(conn->cmd_cnt);
> +
>         if (conn->conn_transport->iscsit_wait_conn)
>                 conn->conn_transport->iscsit_wait_conn(conn);

I strongly believe that waiting for commands complete before decreasing
the command refcounter is useless and leads to hangings.
There was a several tries to wait for the commands complete in the
session. But all of them were eventually reverted due to iSER [1].
[1] https://lore.kernel.org/all/CH2PR12MB4005D671F3D274C4D5FA0BAEDD1C0@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

Let's try it one more time - move conn->conn_transport->iscsit_wait_conn(conn)
to the end of iscsit_release_commands_from_conn() to align iser with other
iscsi transports.

Probably, to have target_wait_for_cmds as a default .iscsit_wait_conn
implementation would be the best way.

> 
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index 90e3b1aef1f1..8bbf0c834b74 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -3174,13 +3174,14 @@ static void target_stop_cmd_counter_confirm(struct percpu_ref *ref)
>   * target_stop_cmd_counter - Stop new IO from being added to the counter.
>   * @cmd_cnt: counter to stop
>   */
> -static void target_stop_cmd_counter(struct target_cmd_counter *cmd_cnt)
> +void target_stop_cmd_counter(struct target_cmd_counter *cmd_cnt)
>  {
>         pr_debug("Stopping command counter.\n");
>         if (!atomic_cmpxchg(&cmd_cnt->stopped, 0, 1))
>                 percpu_ref_kill_and_confirm(&cmd_cnt->refcnt,
>                                             target_stop_cmd_counter_confirm);
>  }
> +EXPORT_SYMBOL_GPL(target_stop_cmd_counter);
> 
>  /**
>   * target_stop_session - Stop new IO from being queued on the session.
> @@ -3196,7 +3197,7 @@ EXPORT_SYMBOL(target_stop_session);
>   * target_wait_for_cmds - Wait for outstanding cmds.
>   * @cmd_cnt: counter to wait for active I/O for.
>   */
> -static void target_wait_for_cmds(struct target_cmd_counter *cmd_cnt)
> +void target_wait_for_cmds(struct target_cmd_counter *cmd_cnt)
>  {
>         int ret;
> 
> @@ -3212,6 +3213,7 @@ static void target_wait_for_cmds(struct target_cmd_counter *cmd_cnt)
>         wait_for_completion(&cmd_cnt->stop_done);
>         pr_debug("Waiting for cmds done.\n");
>  }
> +EXPORT_SYMBOL_GPL(target_wait_for_cmds);
> 
>  /**
>   * target_wait_for_sess_cmds - Wait for outstanding commands
> diff --git a/include/target/target_core_fabric.h b/include/target/target_core_fabric.h
> index 4cbfb532a431..b188b1e90e1e 100644
> --- a/include/target/target_core_fabric.h
> +++ b/include/target/target_core_fabric.h
> @@ -133,6 +133,8 @@ struct se_session *target_setup_session(struct se_portal_group *,
>                                 struct se_session *, void *));
>  void target_remove_session(struct se_session *);
> 
> +void target_stop_cmd_counter(struct target_cmd_counter *cmd_cnt);
> +void target_wait_for_cmds(struct target_cmd_counter *cmd_cnt);
>  struct target_cmd_counter *target_alloc_cmd_counter(void);
>  void target_free_cmd_counter(struct target_cmd_counter *cmd_cnt);
> 
> --
> 2.25.1
> 
> 




[Index of Archives]     [Linux SCSI]     [Kernel Newbies]     [Linux SCSI Target Infrastructure]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Device Mapper]

  Powered by Linux