On Wed, 2015-02-25 at 12:42 +0200, Sagi Grimberg wrote: > On 2/23/2015 10:34 AM, Nicholas A. Bellinger wrote: > > On Sun, 2015-02-22 at 18:36 +0200, Sagi Grimberg wrote: > >> On 2/21/2015 9:54 AM, Nicholas A. Bellinger wrote: <SNIP> > >> iscsit_take_action_for_connection_exit() is invoked both by RX/TX > >> threads. But only one should get to iscsit_close_connection() since > >> conn->connection_exit is set under conn->state_lock. I'd say that if > >> iscsit_close_connection() was invoked twice, the bug is in > >> iscsit_take_action_for_connection_exit() isn't it? > >> > > > > Sorry, yes. > > > > After looking at this further, I think the previous isert_cq_comp_err() > > patch still makes sense for the special logout response failure case, > > but as you've noted it does not address root cause of the original > > OOPsen. > > > > I'm now thinking it's related to complete(conn->conn_logout_comp) > > happening the start of iscsit_close_connection() (as originally intended > > for non-iser logout response failure case), that is causing > > isert_wait4logout() to immediately complete instead of allowing > > iscsit_logout_post_handler() to perform complete(conn->conn_logout_comp) > > after completion interrupt -> isert_do_control_comp() happens. > > > > This could result in iscsit_release_commands_from_conn() corrupting > > conn_cmd_list list when attempting to release the logout response > > before/during iser logout response completion interrupt handling. > > > > Here's a quick patch to test the theory. > > > > --nab > > > > diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c > > index 50bad55..ddbd022 100644 > > --- a/drivers/target/iscsi/iscsi_target.c > > +++ b/drivers/target/iscsi/iscsi_target.c > > @@ -4256,11 +4256,12 @@ int iscsit_close_connection( > > pr_debug("Closing iSCSI connection CID %hu on SID:" > > " %u\n", conn->cid, sess->sid); > > /* > > - * Always up conn_logout_comp just in case the RX Thread is sleeping > > - * and the logout response never got sent because the connection > > - * failed. > > + * Always up conn_logout_comp for the traditional TCP case just in case > > + * the RX Thread in iscsi_target_rx_opcode() is sleeping and the logout > > + * response never got sent because the connection failed. > > */ > > - complete(&conn->conn_logout_comp); > > + if (conn->conn_transport->transport_type == ISCSI_TCP) > > + complete(&conn->conn_logout_comp); > > > > iscsi_release_thread_set(conn); > > > > This does seem to make the list corruption go away. Thanks for the test feedback. This patch is queued in target-pending/master. > I increased the > session count to ~120 doing login/logout loop and at some point I am in > a point where I have 16066 iscsi_ttx and 16064 iscsi_trx threads > causing me to fail any other kthread creation (see dump_stack). > > CPU: 12 PID: 22517 Comm: iscsi_ttx Tainted: G E 3.19.0-rc1+ #34 > Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 > 0000000000000000 ffff8804469dfdc8 ffffffff8153805c ffff8804469dfe58 > ffff8803a5e05700 ffff8804469dfdf8 ffffffffa053b4fe ffff8803b1e95028 > ffff8803b1e95000 ffff8803b1e95000 ffff8804469dfe58 ffff8804469dfe08 > Call Trace: > [<ffffffff8153805c>] dump_stack+0x48/0x5c > [<ffffffffa053b4fe>] iscsi_allocate_thread_sets+0x21e/0x280 > [iscsi_target_mod] > [<ffffffffa053b59a>] iscsi_check_to_add_additional_sets+0x3a/0x40 > [iscsi_target_mod] > [<ffffffffa053b691>] iscsi_tx_thread_pre_handler+0xf1/0x170 > [iscsi_target_mod] > [<ffffffffa054e0a7>] iscsi_target_tx_thread+0x47/0x220 [iscsi_target_mod] > [<ffffffff81538493>] ? __schedule+0x333/0x620 > [<ffffffffa054e060>] ? iscsit_handle_snack+0x180/0x180 [iscsi_target_mod] > [<ffffffff8106ac5e>] kthread+0xce/0xf0 > [<ffffffff8106ab90>] ? kthread_freezable_should_stop+0x70/0x70 > [<ffffffff8153beec>] ret_from_fork+0x7c/0xb0 > [<ffffffff8106ab90>] ? kthread_freezable_should_stop+0x70/0x70 > Unable to start iscsi_target_tx_thread > > For some reason the iscsi extra thread sets are not cleaned up well > and/or not reused from inactive list... > Please revert commit 72859d91, as it's incorrect per your earlier comments wrt iscsit_close_connection() never being called more than once during explicit shutdown. The same has been done in target-pending/master here: https://git.kernel.org/cgit/linux/kernel/git/nab/target-pending.git/commit/?id=796cdd654e2b12010f3443aa039e4385906eacad Apologies, for not being explicit about this earlier. --nab -- To unsubscribe from this list: send the line "unsubscribe target-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html