Since MaxMpx is a per smb socket restriction, we probably want something like:

diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c
index 1a9fe7f..1957f5d 100644
--- a/fs/cifs/cifssmb.c
+++ b/fs/cifs/cifssmb.c
@@ -453,6 +453,11 @@ CIFSSMBNegotiate(unsigned int xid, struct cifs_ses *ses)
 	}
 	server->sec_mode = (__u8)le16_to_cpu(rsp->SecurityMode);
 	server->maxReq = le16_to_cpu(rsp->MaxMpxCount);
+	/* maxReq must be at least 2, almost all servers set to 50,
+	   some old non-MS servers set incorrectly so set to default
+	   (otherwise oplock break e.g. would fail) */
+	if (server->maxReq < 2)
+		server->maxReq = CIFS_MAX_REQ; /* 50 */
 	server->maxBuf = min((__u32)le16_to_cpu(rsp->MaxBufSize),
 			(__u32)CIFSMaxBufSize + MAX_CIFS_HDR_SIZE);
 	server->max_vcs = le16_to_cpu(rsp->MaxNumberVcs);
@@ -560,6 +565,11 @@ CIFSSMBNegotiate(unsigned int xid, struct cifs_ses *ses)
 	/* one byte, so no need to convert this or EncryptionKeyLen
 	   from little endian */
 	server->maxReq = le16_to_cpu(pSMBr->MaxMpxCount);
+	/* maxReq must be at least 2, almost all servers set to 50,
+	   some old non-MS servers set incorrectly so set to default
+	   (otherwise oplock break e.g. would fail) */
+	if (server->maxReq < 2)
+		server->maxReq = CIFS_MAX_REQ; /* 50 */
 	/* probably no need to store and check maxvcs */
 	server->maxBuf = min(le32_to_cpu(pSMBr->MaxBufferSize),
 			(__u32) CIFSMaxBufSize + MAX_CIFS_HDR_SIZE);
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index c761935..3f8cf63 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -95,7 +95,7 @@ cifs_reconnect(struct TCP_Server_Info *server)
 	spin_unlock(&GlobalMid_Lock);
 	server->maxBuf = 0;
 
-	cFYI(1, "Reconnecting tcp session");
+	cERROR(1, "Reconnecting tcp session in flight %d", atomic_read(&server->inFlight)); /* BB FIXME */
 
 	/* before reconnecting the tcp session, mark the smb session (uid)
 		and the tid bad so they are not used until reconnected */
diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
index 147aa22..11c6585 100644
--- a/fs/cifs/transport.c
+++ b/fs/cifs/transport.c
@@ -264,7 +264,12 @@ static int wait_for_free_request(struct TCP_Server_Info *server,
 
 	spin_lock(&GlobalMid_Lock);
 	while (1) {
-		if (atomic_read(&server->inFlight) >= cifs_max_pending) {
+		/* We must check that we don't exceed maxReq, but on SMB
+		   Negotiate it is not initted yet (maxReq must be at
+		   least two) */
+		if ((atomic_read(&server->inFlight) >= cifs_max_pending) ||
+		    ((atomic_read(&server->inFlight) >= server->maxReq) &&
+		     (server->maxReq > 1))) {
 			spin_unlock(&GlobalMid_Lock);
#ifdef CONFIG_CIFS_STATS2
 			atomic_inc(&server->num_waiters);
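To make the intent of the transport.c hunk easier to see outside the kernel,
here is a minimal standalone sketch of the throttle it implements. The names
model_server and can_send_request below are made up for illustration; only
inFlight, maxReq, cifs_max_pending and CIFS_MAX_REQ correspond to the fields
the patch actually touches, and the real code additionally handles locking,
wait queues and the async/oplock-break cases that are omitted here.

/*
 * Userspace model of the proposed check in wait_for_free_request():
 * a request may only go out when the number of SMBs in flight is
 * below both the global module cap (cifs_max_pending) and the
 * server-advertised MaxMpxCount (maxReq).  maxReq <= 1 means "not
 * negotiated yet", so only the global cap applies in that case.
 */
#include <stdbool.h>
#include <stdio.h>

#define CIFS_MAX_REQ 50                 /* default, as in the patch comment */

static unsigned int cifs_max_pending = CIFS_MAX_REQ;   /* module parm */

struct model_server {
	int in_flight;                  /* stands in for server->inFlight */
	unsigned int max_req;           /* stands in for server->maxReq   */
};

static bool can_send_request(const struct model_server *server)
{
	if (server->in_flight >= (int)cifs_max_pending)
		return false;           /* global cap always applies */
	if (server->max_req > 1 && server->in_flight >= (int)server->max_req)
		return false;           /* server cap, once negotiated */
	return true;
}

int main(void)
{
	/* e.g. an XP server that advertises MaxMpxCount = 10 */
	struct model_server xp  = { .in_flight = 10, .max_req = 10 };
	/* during Negotiate, maxReq has not been initialised yet */
	struct model_server pre = { .in_flight = 0,  .max_req = 0 };

	printf("XP server, 10 in flight:    %s\n",
	       can_send_request(&xp) ? "send" : "wait");
	printf("pre-negotiate, 0 in flight: %s\n",
	       can_send_request(&pre) ? "send" : "wait");
	return 0;
}

With a server like the one in this thread that advertises MaxMpxCount of 10,
the second check is the one that should keep the client from overrunning the
server even when cifs_max_pending is left at its default of 50.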
On Fri, Jun 24, 2011 at 3:55 PM, Shane McColman
<smccolman@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> -----Original Message-----
> From: linux-cifs-owner@xxxxxxxxxxxxxxx
> [mailto:linux-cifs-owner@xxxxxxxxxxxxxxx] On Behalf Of Jeff Layton
> Sent: Thursday, June 23, 2011 3:15 PM
> To: Shane McColman
> Cc: linux-cifs@xxxxxxxxxxxxxxx
> Subject: Re: CIFS VFS errors
>
> On Thu, 23 Jun 2011 17:11:12 -0400
> Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
>> On Thu, 23 Jun 2011 14:36:53 -0600
>> "Shane McColman" <smccolman@xxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> > Here you go.
>> >
>>
>> (re-cc'ing linux-cifs)
>>
>> Ok, thanks. It's a little difficult to tell exactly what the problem is
>> from the capture, but it looks like the server is frequently shutting
>> down the connection. For instance, in frame 9:
>>
>> 9 19.088350 192.168.0.177 -> 192.168.0.118 TCP microsoft-ds > 59101 [RST, ACK] Seq=1 Ack=5843 Win=0 Len=0
>>
>> ...and then later in frame 109:
>>
>> 109 48.244414 192.168.0.177 -> 192.168.0.118 TCP microsoft-ds > 59108 [RST, ACK] Seq=10556 Ack=18058 Win=0 Len=0
>>
>> ...etc...
>>
>> Most of the time, the client just reestablishes the connection and
>> keeps on going, but in some cases it looks like other calls are racing
>> in and being sent before the session and tcon get reestablished. Here's
>> one such case (some names changed to protect the data):
>>
>> Server shuts down connection:
>>
>> 579 194.337478 192.168.0.177 -> 192.168.0.118 TCP microsoft-ds > 59310 [RST, ACK] Seq=10528 Ack=18028 Win=0 Len=0
>>
>> Client reestablishes connection:
>>
>> 580 194.337565 192.168.0.118 -> 192.168.0.177 TCP 41098 > microsoft-ds [SYN] Seq=0 Win=14600 Len=0 MSS=1460 SACK_PERM=1 TSV=177050784 TSER=0 WS=7
>> 581 194.337756 192.168.0.177 -> 192.168.0.118 TCP microsoft-ds > 41098 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=0 TSV=0 TSER=0 SACK_PERM=1
>> 582 194.337768 192.168.0.118 -> 192.168.0.177 TCP 41098 > microsoft-ds [ACK] Seq=1 Ack=1 Win=14720 Len=0 TSV=177050784 TSER=0
>>
>> Negotiate request gets sent and reply is received:
>>
>> 583 194.337819 192.168.0.118 -> 192.168.0.177 SMB Negotiate Protocol Request
>> 584 194.341931 192.168.0.177 -> 192.168.0.118 SMB Negotiate Protocol Response
>>
>> ...QPathInfo gets sent -- this shouldn't happen yet:
>>
>> 585 194.341953 192.168.0.118 -> 192.168.0.177 SMB Trans2 Request, QUERY_PATH_INFO, Query File All Info, Path: \AOI Data\Nasdasdasdads8 asdasdsd06.txt
>>
>> ...so the server sends back an error because there's been no session setup:
>>
>> 586 194.342145 192.168.0.177 -> 192.168.0.118 SMB Trans2 Response, QUERY_PATH_INFO, Error: Bad userid
>>
>> ...session and tcon get reestablished:
>>
>> 587 194.342159 192.168.0.118 -> 192.168.0.177 SMB Session Setup AndX Request, User: anonymous
>> 588 194.343149 192.168.0.177 -> 192.168.0.118 SMB Session Setup AndX Response
>> 589 194.343180 192.168.0.118 -> 192.168.0.177 SMB Tree Connect AndX Request, Path: \\asdfasdf1\Aasdsd_1 (F)
>> 590 194.343390 192.168.0.177 -> 192.168.0.118 SMB Tree Connect AndX Response
>>
>> ...and then things work fine until the server shuts down the connection
>> again. In this case, just the QPathInfo failed, but in a later case of
>> this, a number of calls race in before things get set back up again and
>> that's probably where you see the problems occurring.
>>
>> The above problem is a pretty clear condemnation of the whole
>> tcpStatus handling model in cifs. The checks we have to determine
>> whether something should get sent on the socket are racy. That's not a
>> new bug though -- this code has been this way for a long, long time.
>>
>> I think though that this is at its root a server-inflicted wound. If
>> you can figure out why your server is shutting down the connection,
>> you'll probably fix the problems you're seeing.
>>
>> In the meantime, we probably need to overhaul the cifs code such that
>> the above problem can't occur, but that's not trivial and probably
>> isn't going to be fixed anytime soon.
>>
>
> Ahh, I think I might see the cause. In the server's Negotiate reply:
>
>    Max Mpx Count: 10
>
> ...the client currently ignores this value from the server and uses a
> module parm (!?!) to set this.
> What you may want to do is set the cifs_max_pending module parm on
> cifs.ko to 10 and see if that makes things behave better.
>
>
> I verified that Windows XP only allows a max of 10 connections as set by
> Microsoft. This machine already has 5 connections from the 5 pieces of
> equipment here, so I'm left with 5. For this reason I set cifs_max_pending=4
> to stay on the safe side. That was at 9:30 this morning and I have not had
> an error since. I'll check again on Monday and give you an update.
>
>
> --
> Jeff Layton <jlayton@xxxxxxxxxx>

--
Thanks,

Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html