Ben, Thanks - this may be a very rare case - hard to prove without your testing but it looks like Jeff's patch makes sense. On Mon, Jun 6, 2011 at 12:22 PM, Ben Greear <greearb@xxxxxxxxxxxxxxx> wrote: > On 06/06/2011 09:51 AM, Jeff Layton wrote: >> >> On Mon, 06 Jun 2011 09:47:40 -0700 >> Ben Greear<greearb@xxxxxxxxxxxxxxx> wrote: >> >>> On 06/06/2011 06:45 AM, Jeff Layton wrote: >>>> >>>> On Sat, 4 Jun 2011 07:19:23 -0400 > >>>> [PATCH] cifs: don't allow cifs_reconnect to exit with NULL socket >>>> pointer >>>> >>>> It's possible for the following set of events to happen: >>>> >>>> cifsd calls cifs_reconnect which reconnects the socket. A userspace >>>> process then calls cifs_negotiate_protocol to handle the NEGOTIATE and >>>> gets a reply. But, while processing the reply, cifsd calls >>>> cifs_reconnect again. Eventually the GlobalMid_Lock is dropped and the >>>> reply from the earlier NEGOTIATE completes and the tcpStatus is set to >>>> CifsGood. cifs_reconnect then goes through and closes the socket and >>>> sets the >>>> pointer to zero, but because the status is now CifsGood, the new socket >>>> is not created and cifs_reconnect exits with the socket pointer set to >>>> NULL. >>>> >>>> Fix this by only setting the tcpStatus to CifsGood if the tcpStatus is >>>> CifsNeedNegotiate, and by making sure that generic_ip_connect is always >>>> called at least once in cifs_reconnect. >>>> >>>> Note that this is not a perfect fix for this issue. It's still possible >>>> that the NEGOTIATE reply is handled after the socket has been closed and >>>> reconnected. In that case, the socket state will look correct but it no >>>> NEGOTIATE was performed on it. In that situation though the server >>>> should just shut down the socket on the next attempted send, rather >>>> than causing the oops that occurs today. >>>> >>>> Reported-by: Ben Greear<greearb@xxxxxxxxxxxxxxx> >>>> Signed-off-by: Jeff Layton<jlayton@xxxxxxxxxx> >>>> --- >>>> fs/cifs/connect.c | 6 +++--- >>>> 1 files changed, 3 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c >>>> index 84c7307..8bb55bc 100644 >>>> --- a/fs/cifs/connect.c >>>> +++ b/fs/cifs/connect.c >>>> @@ -152,7 +152,7 @@ cifs_reconnect(struct TCP_Server_Info *server) >>>> mid_entry->callback(mid_entry); >>>> } >>>> >>>> - while (server->tcpStatus == CifsNeedReconnect) { >>>> + do { >>>> try_to_freeze(); >>>> >>>> /* we should try only the port we connected to before */ >>>> @@ -167,7 +167,7 @@ cifs_reconnect(struct TCP_Server_Info *server) >>>> server->tcpStatus = CifsNeedNegotiate; >>>> spin_unlock(&GlobalMid_Lock); >>>> } >>>> - } >>>> + } while (server->tcpStatus == CifsNeedReconnect); >>>> >>>> return rc; >>>> } >>>> @@ -3371,7 +3371,7 @@ int cifs_negotiate_protocol(unsigned int xid, >>>> struct cifs_ses *ses) >>>> } >>>> if (rc == 0) { >>>> spin_lock(&GlobalMid_Lock); >>>> - if (server->tcpStatus != CifsExiting) >>>> + if (server->tcpStatus == CifsNeedNegotiate) >>>> server->tcpStatus = CifsGood; >>>> else >>>> rc = -EHOSTDOWN; >>> >>> >>> This has some merge issues on 3.6.38.8: >>> >>> >>> <<<<<<< >>> while ((server->tcpStatus != CifsExiting)&& >>> (server->tcpStatus != CifsGood)) { >>> ======= >>> do { >>> >>>>>>> >>> >>> Should I keep your comparison for tcpStatus == CifsNeedReconnect >>> instead of these != comparisons above? >>> >>> >>> Thanks, >>> Ben >>> >> >> No, I think you probably just want to take patch fd88ce9313 too, which >> should fix up the merge conflict. > > Ok, I've applied those two..we'll start testing with these patches > today. Might take a while before we are certain the fix works, as > this isn't usually easy or fast to reproduce. > > Thanks, > Ben > > -- > Ben Greear <greearb@xxxxxxxxxxxxxxx> > Candela Technologies Inc http://www.candelatech.com > > -- Thanks, Steve -- To unsubscribe from this list: send the line "unsubscribe linux-cifs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html