Re: BUG in stress login-logout to multiple IQNs

Hi Sagi,

Apologies for the delayed response. Comments below.

On Sat, 2015-02-14 at 12:21 +0200, Sagi Grimberg wrote:
> On 2/12/2015 10:47 AM, Nicholas A. Bellinger wrote:
> > On Wed, 2015-02-11 at 10:17 +0200, Sagi Grimberg wrote:
> >> Hey Nic,
> >>
> >> So our QA guys recently stepped on this bug when performing stress
> >> login-logout from a single initiator to 10 targets each exposed over
> >> 4 portals, so overall 40 sessions (needless to say we are talking on
> >> iser...). So there are lots of logins in parallel with lots of logouts.
> >>
> >> It seems that the connection termination causes iscsi_tx_thread to
> >> access the connection after it is freed or something (list corruption
> >> probably coming from iscsit_handle_immediate_queue or
> >> iscsit_handle_response_queue, and NULL deref coming from
> >> iscsit_take_action_for_connection_exit).
> >>
> >> Note, isert_wait_conn waits for session commands and QP flush which is
> >> normally pretty fast, the conn termination is done in a work that waits
> >> for DISCONNECTED event which might take longer (which is why we do it
> >> outside wait_conn context to avoid blocking it).
> >>
> >> I didn't get too far with this until now, do you have any idea on what
> >> might have happened?
> >
> > Mmm, it looks like iscsit_take_action_for_connection_exit() in TX thread
> > context is calling iscsit_close_connection() after hitting the following
> > check in iscsi_target_erl0.c:
> >
> >          if (conn->conn_state == TARG_CONN_STATE_IN_LOGOUT) {
> >                  spin_unlock_bh(&conn->state_lock);
> >                  iscsit_close_connection(conn);
> >                  return;
> >          }
> >
> > .. once iscsit_close_connection() has already been called earlier by
> > iser-target code.
> 
> Not sure I understand where iscsit_close_connection is called earlier
> by iser target. The iser code usually only notifies any problems to the
> iscsi layer to do its thing.
> 
> Care to explain how iscsit_close_connection might be called twice?
> 

It appears iscsit_close_connection() is first invoked from iscsi_trx
context, after isert_cq_comp_err() has called
iscsit_cause_connection_reinstatement() to force a connection failure
during an explicit logout with ISCSI_LOGOUT_REASON_CLOSE_SESSION.

You can tell because the isert_wait_conn() + isert_wait4cmds() debug
output appears before the list_del corruption in iscsi_ttx context, and
isert_wait_conn() can only be reached via iscsit_close_connection() ->
transport->wait_for_conn() -> isert_wait_conn().

Once iscsi_ttx context runs, it hits the TARG_CONN_STATE_IN_LOGOUT
state check in iscsit_take_action_for_connection_exit() and re-invokes
iscsit_close_connection(). This happens after
iscsit_logout_closesession(), called from isert_rx_completion() context,
has handled REASON_CLOSE_SESSION and changed the connection state to
IN_LOGOUT, but before the logout response has been posted and
successfully completed in isert_do_control_comp().
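
To make the ordering above concrete, here is a small userspace model of
the sequence. It is only a sketch and not kernel code: struct model_conn,
the model_* functions, replay_sequence() and the close_started flag are
all hypothetical names made up for illustration, and the guard is not a
fix proposed in this thread. It just shows how the IN_LOGOUT check can
lead to a second close once the first close has already run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the connection states named above. */
enum model_conn_state {
	MODEL_CONN_STATE_LOGGED_IN,
	MODEL_CONN_STATE_IN_LOGOUT,
};

/* Hypothetical model of a connection; this is not struct iscsi_conn. */
struct model_conn {
	enum model_conn_state state;
	int close_calls;	/* how many times teardown actually ran */
	bool close_started;	/* hypothetical guard, only used if guarded */
};

/* Models iscsit_close_connection(): teardown that must run only once. */
static void model_close_connection(struct model_conn *conn, bool guarded)
{
	assert(conn != NULL);

	if (guarded) {
		if (conn->close_started)
			return;		/* a close is already in progress */
		conn->close_started = true;
	}
	conn->close_calls++;
}

/* Models the IN_LOGOUT check in iscsit_take_action_for_connection_exit(). */
static void model_take_action_for_connection_exit(struct model_conn *conn,
						   bool guarded)
{
	if (conn->state == MODEL_CONN_STATE_IN_LOGOUT) {
		model_close_connection(conn, guarded);
		return;
	}
}

/* Replays the described ordering; returns how often teardown ran. */
static int replay_sequence(bool guarded)
{
	struct model_conn conn = { .state = MODEL_CONN_STATE_LOGGED_IN };

	/* 1. RX completion handles CLOSE_SESSION: state becomes IN_LOGOUT. */
	conn.state = MODEL_CONN_STATE_IN_LOGOUT;

	/* 2. Completion error forces reinstatement; RX side closes first. */
	model_close_connection(&conn, guarded);

	/* 3. TX side runs its exit handling and still sees IN_LOGOUT. */
	model_take_action_for_connection_exit(&conn, guarded);

	return conn.close_calls;
}
```

Calling replay_sequence(false) runs the teardown twice, which corresponds
to the double close described above. With the hypothetical guard enabled,
replay_sequence(true) runs it once. The model is single-threaded, so it
only captures the ordering, not the locking a real change would need.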

--nab
