[PATCH] CIFS: Fix race condition on RFC1002_NEGATIVE_SESSION_RESPONSE

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Greetings,


I have been running into an issue (kernel v3.10.40) when connecting to an NT 3.51 Workstation host with a configuration missing the NetBIOS name ('servern' option.) Under some conditions, the mount helper would hang forever in an uninterruptible sleep state. The mount helper comes from busybox. Under some other conditions the mount program would exit with an error and not hang.

I managed to create a setup where I could reliably reproduce both cases (the "good case" where the program exits, and the "bad case" where the program hangs forever.)

The problem seems to be a race condition between the demux thread and the "main"(?) thread over the srv_mutex. Here is the summary of the functions calls that lead to this problem:

* demux thread:
cifs_demultiplex_thread()
  is_smb_response()
    [connect.c:626 -- case RFC1002_NEGATIVE_SESSION_RESPONSE]
      cifs_reconnect()
        [connect.c:380]
        do {
          mutex_lock(&server->srv_mutex);
          generic_ip_connect(server);
          // on error -> msleep(3000);
          mutex_unlock(&server->srv_mutex);
        } while (server->tcpStatus == CifsNeedReconnect);

* "main" thread:
cifs_negotiate()
  CIFSSMBNegotiate()
    SendReceive()
      [transport.c:821 - thread hangs forever]
      mutex_lock(&ses->server->srv_mutex);

Another interesting piece of information is that in the good case at
cifs_reconnect(), generic_ip_connect() returns -EINTR, whereas in the bad case it returns -ECONNREFUSED. In the bad case it all leads to generic_ip_connect() being called over and over again with the same result, but never exiting the loop (thus: hanging.)

The following patch works around the issue by not re-attempting the SMB negotiation:

---8<----------------------------------------------------------------------
diff --git a/fs/cifs/smb1ops.c b/fs/cifs/smb1ops.c
index 4885a40..863a2da 100644
--- a/fs/cifs/smb1ops.c
+++ b/fs/cifs/smb1ops.c
@@ -415,10 +415,12 @@ cifs_negotiate(const unsigned int xid, struct cifs_ses *ses)
        int rc;
        rc = CIFSSMBNegotiate(xid, ses);
        if (rc == -EAGAIN) {
+#if 0
                /* retry only once on 1st time connection */
                set_credits(ses->server, 1);
                rc = CIFSSMBNegotiate(xid, ses);
                if (rc == -EAGAIN)
+#endif
                        rc = -EHOSTDOWN;
        }
        return rc;
---------------------------------------------------------------------->8---

I was able, however, to identify a (hopefully) better solution for the issue (see the attached patch.)

I would really appreciate your feedback on the attached patch. Please let me know if the solution seems acceptable as well as side-effects-free. We use CIFS to connect to older Windows systems and we have been experiencing similar issues for a while now (which I hope to solve with this patch.)

Thanks a lot in advance! :-)


Kind regards,


Federico Sauter
Senior Firmware Programmer
--
Innominate Security Technologies AG
Rudower Chaussee 13 | 12489 Berlin | Germany
tel: +49 30 921028-210 | fax: +49 30 921028-020
www.innominate.com | www.twitter.com/mGuardcom

Register Court: AG Charlottenburg, HR B 81603
Management Board: Dirk Seewald | Chairman of the Supervisory Board: Christoph Leifer
>From 54e3c95a4646c8666c6f08766250fd056b06e7f5 Mon Sep 17 00:00:00 2001
From: Federico Sauter <fsauter@xxxxxxxxxxxxxx>
Date: Tue, 17 Mar 2015 17:45:28 +0100
Subject: [PATCH] CIFS: Fix race condition on
 RFC1002_NEGATIVE_SESSION_RESPONSE

This patch fixes a race condition that occurs when connecting
to a NT 3.51 host without specifying a NetBIOS name.
In that case a RFC1002_NEGATIVE_SESSION_RESPONSE is received
and the SMB negotiation is reattempted, but under some conditions
it leads SendReceive() to hang forever while waiting for srv_mutex.
This, in turn, sets the calling process to an uninterruptible sleep
state and makes it unkillable.

The solution is to unlock the srv_mutex acquired in the demux
thread *before* going to sleep (after the reconnect error) and
before reattempting the connection.
---
 fs/cifs/connect.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index d05a300..a45e7fc 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -381,6 +381,7 @@ cifs_reconnect(struct TCP_Server_Info *server)
 		rc = generic_ip_connect(server);
 		if (rc) {
 			cifs_dbg(FYI, "reconnect error %d\n", rc);
+			mutex_unlock(&server->srv_mutex);
 			msleep(3000);
 		} else {
 			atomic_inc(&tcpSesReconnectCount);
@@ -388,8 +389,8 @@ cifs_reconnect(struct TCP_Server_Info *server)
 			if (server->tcpStatus != CifsExiting)
 				server->tcpStatus = CifsNeedNegotiate;
 			spin_unlock(&GlobalMid_Lock);
+			mutex_unlock(&server->srv_mutex);
 		}
-		mutex_unlock(&server->srv_mutex);
 	} while (server->tcpStatus == CifsNeedReconnect);
 
 	return rc;
-- 
1.7.10.4


[Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux