Re: iscsi_trx going into D state

On 10/04/2016 09:55 AM, Johannes Thumshirn wrote:
> On Fri, Sep 30, 2016 at 11:14:57AM -0600, Robert LeBlanc wrote:
>> We are having a reoccurring problem where iscsi_trx is going into D
>> state. It seems like it is waiting for a session tear down to happen
>> or something, but keeps waiting. We have to reboot these targets on
>> occasion. This is running the 4.4.12 kernel and we have seen it on
>> several previous 4.4.x and 4.2.x kernels. There is no message in dmesg
>> or /var/log/messages. This seems to happen with increased frequency
>> when we have a disruption in our Infiniband fabric, but can happen
>> without any changes to the fabric (other than hosts rebooting).
>>
>> # ps aux | grep iscsi | grep D
>> root      4185  0.0  0.0      0     0 ?        D    Sep29   0:00 [iscsi_trx]
>> root     18505  0.0  0.0      0     0 ?        D    Sep29   0:00 [iscsi_np]
>>
>> # cat /proc/4185/stack
>> [<ffffffff814cc999>] target_wait_for_sess_cmds+0x49/0x1a0
>> [<ffffffffa087292b>] isert_wait_conn+0x1ab/0x2f0 [ib_isert]
>> [<ffffffff814f0de2>] iscsit_close_connection+0x162/0x840
>> [<ffffffff814df8df>] iscsit_take_action_for_connection_exit+0x7f/0x100
>> [<ffffffff814effc0>] iscsi_target_rx_thread+0x5a0/0xe80
>> [<ffffffff8109c6f8>] kthread+0xd8/0xf0
>> [<ffffffff8172004f>] ret_from_fork+0x3f/0x70
>> [<ffffffffffffffff>] 0xffffffffffffffff
>>
>> # cat /proc/18505/stack
>> [<ffffffff814f0c71>] iscsit_stop_session+0x1b1/0x1c0
>> [<ffffffff814e2436>] iscsi_check_for_session_reinstatement+0x1e6/0x270
>> [<ffffffff814e4df0>] iscsi_target_check_for_existing_instances+0x30/0x40
>> [<ffffffff814e4f40>] iscsi_target_do_login+0x140/0x640
>> [<ffffffff814e62dc>] iscsi_target_start_negotiation+0x1c/0xb0
>> [<ffffffff814e402b>] iscsi_target_login_thread+0xa9b/0xfc0
>> [<ffffffff8109c6f8>] kthread+0xd8/0xf0
>> [<ffffffff8172004f>] ret_from_fork+0x3f/0x70
>> [<ffffffffffffffff>] 0xffffffffffffffff
>>
>> What can we do to help get this resolved?
>>
>> Thanks,
>>
>> ----------------
>> Robert LeBlanc
>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
> 
> Hi,
> I've encountered the same issue and found a hack to fix it at [1], but I think
> the correct way to handle this would be, as you said, to tear down
> the session if a TASK ABORT times out. Unfortunately I'm not really
> familiar with the target code myself, so I'm mainly using this reply to get
> into the Cc loop.
> 
> [1] http://marc.info/?l=linux-scsi&m=147282568910535&w=2
> 
> 
Hmm. Looking at the code, it looks as if we might be missing some calls to
'complete'. Can you try the attached patch?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@xxxxxxx			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
From d481d8c27df8c09ea3798ce4a7217a26c3533161 Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@xxxxxxx>
Date: Tue, 4 Oct 2016 11:05:46 +0200
Subject: [PATCH] iscsi_target: sanitize sleep_on_sess_wait_comp

When closing a session we should only set 'sleep_on_sess_wait_comp'
if we are actually calling wait_for_completion(). And we should indeed
call 'complete' in these cases, too.
Also add some WARN_ON()s in case we get the number of completions
wrong.

Signed-off-by: Hannes Reinecke <hare@xxxxxxxx>
---
 drivers/target/iscsi/iscsi_target.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index 39b928c..313724c 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -4287,6 +4287,7 @@ int iscsit_close_connection(
 	if (!atomic_read(&sess->session_reinstatement) &&
 	     atomic_read(&sess->session_fall_back_to_erl0)) {
 		spin_unlock_bh(&sess->conn_lock);
+		WARN_ON(atomic_read(&sess->sleep_on_sess_wait_comp));
 		iscsit_close_session(sess);
 
 		return 0;
@@ -4557,7 +4558,6 @@ int iscsit_free_session(struct iscsi_session *sess)
 	int is_last;
 
 	spin_lock_bh(&sess->conn_lock);
-	atomic_set(&sess->sleep_on_sess_wait_comp, 1);
 
 	list_for_each_entry_safe(conn, conn_tmp, &sess->sess_conn_list,
 			conn_list) {
@@ -4585,7 +4585,10 @@ int iscsit_free_session(struct iscsi_session *sess)
 
 	if (atomic_read(&sess->nconn)) {
 		spin_unlock_bh(&sess->conn_lock);
+		atomic_inc(&sess->sleep_on_sess_wait_comp);
 		wait_for_completion(&sess->session_wait_comp);
+		atomic_dec(&sess->sleep_on_sess_wait_comp);
+		WARN_ON(atomic_read(&sess->sleep_on_sess_wait_comp));
 	} else
 		spin_unlock_bh(&sess->conn_lock);
 
@@ -4603,8 +4606,6 @@ void iscsit_stop_session(
 	int is_last;
 
 	spin_lock_bh(&sess->conn_lock);
-	if (session_sleep)
-		atomic_set(&sess->sleep_on_sess_wait_comp, 1);
 
 	if (connection_sleep) {
 		list_for_each_entry_safe(conn, conn_tmp, &sess->sess_conn_list,
@@ -4636,7 +4637,10 @@ void iscsit_stop_session(
 
 	if (session_sleep && atomic_read(&sess->nconn)) {
 		spin_unlock_bh(&sess->conn_lock);
+		atomic_inc(&sess->sleep_on_sess_wait_comp);
 		wait_for_completion(&sess->session_wait_comp);
+		atomic_dec(&sess->sleep_on_sess_wait_comp);
+		WARN_ON(atomic_read(&sess->sleep_on_sess_wait_comp));
 	} else
 		spin_unlock_bh(&sess->conn_lock);
 }
-- 
2.6.6

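To make the accounting in the patch above easier to follow, here is a
rough user-space sketch of the same idea: a waiter registers itself only
when it is really going to sleep, and the side closing the last connection
posts one complete() per registered waiter. Everything in it is an
assumption for illustration: struct completion is rebuilt on top of
pthreads, and struct demo_session, demo_stop_session(),
demo_close_connection() and demo_run() are made-up names, not the kernel
code. Unlike the patch, the sketch bumps the waiter count while still
holding the lock, which is what keeps the toy version free of lost wakeups.

```c
#include <assert.h>
#include <pthread.h>

/* User-space stand-in for the kernel's struct completion (illustration only). */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	unsigned int done;
};

/* Post one completion and wake a sleeper, if there is one. */
static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done++;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Sleep until a completion has been posted, then consume it. */
static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (c->done == 0)
		pthread_cond_wait(&c->cond, &c->lock);
	c->done--;
	pthread_mutex_unlock(&c->lock);
}

/* Hypothetical, heavily reduced session; this is not struct iscsi_session. */
struct demo_session {
	pthread_mutex_t conn_lock;
	int nconn;	/* connections still open */
	int waiters;	/* tasks sleeping on session_wait_comp */
	struct completion session_wait_comp;
};

static void demo_session_init(struct demo_session *s, int nconn)
{
	assert(nconn >= 0);
	pthread_mutex_init(&s->conn_lock, NULL);
	pthread_mutex_init(&s->session_wait_comp.lock, NULL);
	pthread_cond_init(&s->session_wait_comp.cond, NULL);
	s->session_wait_comp.done = 0;
	s->nconn = nconn;
	s->waiters = 0;
}

/* Stop side: count ourselves as a waiter only if we really go to sleep. */
static void demo_stop_session(struct demo_session *s)
{
	pthread_mutex_lock(&s->conn_lock);
	if (s->nconn == 0) {
		pthread_mutex_unlock(&s->conn_lock);
		return;
	}
	s->waiters++;
	pthread_mutex_unlock(&s->conn_lock);
	wait_for_completion(&s->session_wait_comp);
	pthread_mutex_lock(&s->conn_lock);
	s->waiters--;
	pthread_mutex_unlock(&s->conn_lock);
}

/* Close side: the last connection posts one complete() per waiter. */
static void demo_close_connection(struct demo_session *s)
{
	int wake = 0;

	pthread_mutex_lock(&s->conn_lock);
	if (--s->nconn == 0)
		wake = s->waiters;
	pthread_mutex_unlock(&s->conn_lock);
	while (wake-- > 0)
		complete(&s->session_wait_comp);
}

static void *demo_waiter(void *arg)
{
	demo_stop_session(arg);
	return NULL;
}

/* Race one waiter against nconn closes; returns the leftover waiter count. */
static int demo_run(int nconn)
{
	struct demo_session s;
	pthread_t t;
	int i;

	demo_session_init(&s, nconn);
	if (pthread_create(&t, NULL, demo_waiter, &s) != 0)
		return -1;
	for (i = 0; i < nconn; i++)
		demo_close_connection(&s);
	pthread_join(t, NULL);
	return s.waiters;
}
```

Built with -pthread, a call such as demo_run(2) from main() should come
back with a waiter count of 0 regardless of which thread wins the race. A
nonzero count at that point is the condition the WARN_ON()s in the patch
are meant to flag.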
