Re: BUG in stress login-logout to multiple IQNs

On 2/26/2015 9:47 AM, Nicholas A. Bellinger wrote:
On Wed, 2015-02-25 at 12:42 +0200, Sagi Grimberg wrote:
On 2/23/2015 10:34 AM, Nicholas A. Bellinger wrote:
On Sun, 2015-02-22 at 18:36 +0200, Sagi Grimberg wrote:
On 2/21/2015 9:54 AM, Nicholas A. Bellinger wrote:

<SNIP>

iscsit_take_action_for_connection_exit() is invoked by both the RX and TX
threads, but only one of them should get to iscsit_close_connection(), since
conn->connection_exit is set under conn->state_lock. I'd say that if
iscsit_close_connection() was invoked twice, the bug is in
iscsit_take_action_for_connection_exit(), isn't it?
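
To make the argument above concrete, here is a minimal userspace sketch of a
check-and-set guard taken under a lock. It is not the kernel code: every
demo_* name is made up for illustration, a pthread mutex stands in for
conn->state_lock, and the two exit reports are issued one after the other
instead of from real RX/TX threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the connection; not struct iscsi_conn. */
struct demo_conn {
	pthread_mutex_t state_lock;	/* models conn->state_lock */
	bool connection_exit;		/* models conn->connection_exit */
	int close_calls;		/* times the close path was entered */
};

/* Stand-in for iscsit_close_connection(): it only counts invocations. */
static void demo_close_connection(struct demo_conn *conn)
{
	conn->close_calls++;
}

/*
 * Check-and-set under the lock: the first caller sets the flag and goes on
 * to close; any later caller sees the flag already set and backs off.
 * Returns true if this call performed the close.
 */
static bool demo_take_action_for_connection_exit(struct demo_conn *conn)
{
	pthread_mutex_lock(&conn->state_lock);
	if (conn->connection_exit) {
		pthread_mutex_unlock(&conn->state_lock);
		return false;
	}
	conn->connection_exit = true;
	pthread_mutex_unlock(&conn->state_lock);

	demo_close_connection(conn);
	return true;
}

/* Simulates both paths reporting the exit; returns the number of closes. */
static int demo_exit_twice_close_count(void)
{
	struct demo_conn conn = { .connection_exit = false, .close_calls = 0 };
	int rc = pthread_mutex_init(&conn.state_lock, NULL);

	assert(rc == 0);
	(void)rc;
	demo_take_action_for_connection_exit(&conn);	/* e.g. RX path */
	demo_take_action_for_connection_exit(&conn);	/* e.g. TX path */
	pthread_mutex_destroy(&conn.state_lock);
	return conn.close_calls;
}
```

With a guard of this shape, demo_exit_twice_close_count() counts a single
close no matter how many callers report the exit, which is why a double close
would point at the guard itself rather than at its callers.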


Sorry, yes.

After looking at this further, I think the previous isert_cq_comp_err()
patch still makes sense for the special logout response failure case,
but as you've noted it does not address the root cause of the original
OOPsen.

I'm now thinking it's related to complete(&conn->conn_logout_comp)
happening at the start of iscsit_close_connection() (as originally intended
for the non-iser logout response failure case), which causes
isert_wait4logout() to complete immediately instead of allowing
iscsit_logout_post_handler() to perform complete(&conn->conn_logout_comp)
after the completion interrupt -> isert_do_control_comp() happens.

This could result in iscsit_release_commands_from_conn() corrupting the
conn_cmd_list when attempting to release the logout response
before/during iser logout response completion interrupt handling.

Here's a quick patch to test the theory.

--nab

diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index 50bad55..ddbd022 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -4256,11 +4256,12 @@ int iscsit_close_connection(
          pr_debug("Closing iSCSI connection CID %hu on SID:"
                  " %u\n", conn->cid, sess->sid);
          /*
-        * Always up conn_logout_comp just in case the RX Thread is sleeping
-        * and the logout response never got sent because the connection
-        * failed.
+        * Always up conn_logout_comp for the traditional TCP case just in case
+        * the RX Thread in iscsi_target_rx_opcode() is sleeping and the logout
+        * response never got sent because the connection failed.
           */
-       complete(&conn->conn_logout_comp);
+       if (conn->conn_transport->transport_type == ISCSI_TCP)
+               complete(&conn->conn_logout_comp);

          iscsi_release_thread_set(conn);
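
The ordering problem the patch targets can be modelled in a few lines of
userspace C. This is only a sketch under stated assumptions: the completion
is reduced to a non-blocking counter, all demo_* names and the `patched` flag
are hypothetical, and no real iSER completion handling is involved. It shows
only whether a waiter on the logout completion would be released before the
logout response has actually completed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical transport tags; the patch itself tests for ISCSI_TCP. */
enum demo_transport { DEMO_TRANSPORT_TCP, DEMO_TRANSPORT_ISER };

/* Minimal non-blocking model of a completion: complete() bumps a counter. */
struct demo_completion {
	unsigned int done;
};

static void demo_complete(struct demo_completion *c)
{
	c->done++;
}

/* Returns true (and consumes one event) if a waiter would be released now. */
static bool demo_try_wait(struct demo_completion *c)
{
	if (c->done == 0)
		return false;
	c->done--;
	return true;
}

struct demo_logout_conn {
	enum demo_transport transport;
	struct demo_completion logout_comp;	/* models conn_logout_comp */
};

/* Models the top of the close path, with and without the transport check. */
static void demo_close_connection_start(struct demo_logout_conn *conn,
					bool patched)
{
	if (!patched || conn->transport == DEMO_TRANSPORT_TCP)
		demo_complete(&conn->logout_comp);
}

/*
 * The logout response has NOT completed yet when this runs, so a true
 * return means the waiter was released too early.
 */
static bool demo_waiter_released_early(enum demo_transport transport,
				       bool patched)
{
	struct demo_logout_conn conn = { .transport = transport };

	demo_close_connection_start(&conn, patched);
	return demo_try_wait(&conn.logout_comp);
}

/* Usage example: the three cases discussed in the thread. */
static void demo_logout_self_check(void)
{
	assert(demo_waiter_released_early(DEMO_TRANSPORT_ISER, false));
	assert(!demo_waiter_released_early(DEMO_TRANSPORT_ISER, true));
	assert(demo_waiter_released_early(DEMO_TRANSPORT_TCP, true));
}
```

In this model the unconditional complete() releases the iSER waiter before
the response completion has run, while the transport check keeps it waiting
and still wakes the TCP case as originally intended.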


This does seem to make the list corruption go away.

Thanks for the test feedback.

This patch is queued in target-pending/master.

I increased the
session count to ~120 doing a login/logout loop, and at some point I end
up with 16066 iscsi_ttx and 16064 iscsi_trx threads, which causes any
other kthread creation to fail (see dump_stack).

CPU: 12 PID: 22517 Comm: iscsi_ttx Tainted: G            E  3.19.0-rc1+ #34
Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013
0000000000000000 ffff8804469dfdc8 ffffffff8153805c ffff8804469dfe58
ffff8803a5e05700 ffff8804469dfdf8 ffffffffa053b4fe ffff8803b1e95028
ffff8803b1e95000 ffff8803b1e95000 ffff8804469dfe58 ffff8804469dfe08
Call Trace:
[<ffffffff8153805c>] dump_stack+0x48/0x5c
[<ffffffffa053b4fe>] iscsi_allocate_thread_sets+0x21e/0x280 [iscsi_target_mod]
[<ffffffffa053b59a>] iscsi_check_to_add_additional_sets+0x3a/0x40 [iscsi_target_mod]
[<ffffffffa053b691>] iscsi_tx_thread_pre_handler+0xf1/0x170 [iscsi_target_mod]
[<ffffffffa054e0a7>] iscsi_target_tx_thread+0x47/0x220 [iscsi_target_mod]
[<ffffffff81538493>] ? __schedule+0x333/0x620
[<ffffffffa054e060>] ? iscsit_handle_snack+0x180/0x180 [iscsi_target_mod]
[<ffffffff8106ac5e>] kthread+0xce/0xf0
[<ffffffff8106ab90>] ? kthread_freezable_should_stop+0x70/0x70
[<ffffffff8153beec>] ret_from_fork+0x7c/0xb0
[<ffffffff8106ab90>] ? kthread_freezable_should_stop+0x70/0x70
Unable to start iscsi_target_tx_thread

For some reason the extra iscsi thread sets are not cleaned up properly
and/or not reused from the inactive list...
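
For comparison, this is roughly what one would expect from a pool that does
reuse its inactive entries. It is a hypothetical userspace sketch, not the
iscsi_target_mod thread-set code: the demo_* names, the singly linked
inactive list and the loop parameters are assumptions made for the example,
and no threads are created.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a thread set; it carries no real threads. */
struct demo_thread_set {
	struct demo_thread_set *next;
};

struct demo_ts_pool {
	struct demo_thread_set *inactive;	/* released sets, ready for reuse */
	size_t total_allocated;			/* sets ever allocated */
};

/* Prefer an inactive set; allocate a new one only when the list is empty. */
static struct demo_thread_set *demo_get_thread_set(struct demo_ts_pool *pool)
{
	struct demo_thread_set *ts = pool->inactive;

	if (ts) {
		pool->inactive = ts->next;
		ts->next = NULL;
		return ts;
	}
	ts = calloc(1, sizeof(*ts));
	if (ts)
		pool->total_allocated++;
	return ts;
}

/* On logout the set goes back onto the inactive list instead of leaking. */
static void demo_release_thread_set(struct demo_ts_pool *pool,
				    struct demo_thread_set *ts)
{
	ts->next = pool->inactive;
	pool->inactive = ts;
}

static void demo_destroy_pool(struct demo_ts_pool *pool)
{
	while (pool->inactive) {
		struct demo_thread_set *ts = pool->inactive;

		pool->inactive = ts->next;
		free(ts);
	}
}

/* Logs `sessions` sessions in and out `iterations` times; returns allocations. */
static size_t demo_login_logout_loop(size_t sessions, size_t iterations)
{
	struct demo_ts_pool pool = { NULL, 0 };
	struct demo_thread_set **active = calloc(sessions, sizeof(*active));
	size_t total;

	assert(active != NULL);
	for (size_t it = 0; it < iterations; it++) {
		for (size_t i = 0; i < sessions; i++) {
			active[i] = demo_get_thread_set(&pool);
			assert(active[i] != NULL);
		}
		for (size_t i = 0; i < sessions; i++) {
			demo_release_thread_set(&pool, active[i]);
			active[i] = NULL;
		}
	}
	total = pool.total_allocated;
	demo_destroy_pool(&pool);
	free(active);
	return total;
}
```

In this model the number of allocated sets is bounded by the peak number of
concurrent sessions, so a count that keeps growing across login/logout
iterations would suggest that released sets are not making it back to the
inactive list or are not being picked up from it.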


Please revert commit 72859d91, as it's incorrect per your earlier
comments wrt iscsit_close_connection() never being called more than once
during explicit shutdown.

This reproduces with commit 72859d91 reverted.