On Thu, May 29, 2014 at 11:02 AM, Denys Sobchyshak <denys.sobchyshak@xxxxxxxxx> wrote:
> Hi cifs community,
>
> Problem: periodically (meaning that I don't know how to reproduce it)
> a mounted Windows share becomes inaccessible, i.e. a simple ls -l
> command takes hours to output anything (normally it does output the
> contents in the end, though).
>
> Environment: MS Hyper-V with Server 2012 as a host facilitating
> communication between CentOS 6.4 and MS Server 2012 guests (everything
> 64-bit). On Windows the folder was marked as a public share. On CentOS
> cifs-utils was installed and the fstab entry looks as follows:
>
> //192.168.178.202/share /mnt/share cifs uid=504,username=myuser,dom=mydomain,password=mypassword,iocharset=utf8,noperm,ro 0 0
>
> Note: Parallel to it there's also a network attached storage mounted,
> with Linux installed on it, which has never failed me even with
> suspend and hibernate modes enabled. Also, I can't find them now, but
> I've noticed some warnings in the CentOS logs saying that it failed to
> open a socket or something alike. I've also asked this question before
> and found a workaround which doesn't help anymore.
> http://superuser.com/questions/678855/windows-share-is-not-accessible-from-time-to-time
>
> Question: since I'm not much of a network guy I can't find where the
> problem is located and am not even sure how to look for it, so I would
> appreciate any advice on how to diagnose the problem and/or identify
> the source of the error. Apart from that, I'm wondering if this is a
> known issue and how one can resolve it.

A couple of quick thoughts on this:

If a server doesn't respond, or the network goes down, the Linux cifs
client will generally disconnect and then reconnect automatically and
transparently, which should be harmless, but how and when the client
does this has changed.

Initially the cifs client was designed with the following reconnect
logic:

1) For anything other than a file write request (or blocking lock
   request), if the server doesn't respond within the default timeout
   (which was well under a minute), disconnect the socket and
   reconnect.
2) For a write request, use a much longer timeout. A write request
   beyond end of file (which could take hours if you picked a really
   big starting offset) would never time out.

The logic was changed (after RHEL6, but the Red Hat guys have probably
backported it, at least to the most recent SP) to:

1) If a request has taken more than about 30 seconds, send an SMB echo
   request.
2) If the server does not respond to a few echo requests, kill the TCP
   session and reconnect.

The advantage of the newer behavior (which was added a few years ago)
is that a slow request (opening an offline file on a tape drive, for
example) no longer causes an otherwise healthy server to appear to be
dead. The chance of disconnecting from a "healthy" server goes way
down, since we won't disconnect from a server which is still
responding to SMB echo requests.

The workaround you pointed to, a cron job that periodically does
something trivial on the mount, prevents the server from
autodisconnecting the socket (some servers autodisconnect inactive
connections with no active files), although reconnecting should be
harmless and transparent even in that case (except where your Kerberos
credentials have expired and can't be reacquired, or where the
password changed on the server).

--
Thanks,

Steve
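
To make the two reconnect policies described above concrete, here is a
rough Python model of the decision each one makes. It is only a sketch,
not the kernel code: the function names, the request "kind" strings and
every numeric constant are hypothetical values chosen to match the
wording above ("well under a minute", "about 30 seconds", "a few echo
requests"). The message does not say what timeout applied to blocking
lock requests, so the sketch assumes they never trigger a reconnect.

```python
# Hypothetical constants for illustration; not the kernel's real values.
OLD_DEFAULT_TIMEOUT_S = 15.0   # "well under a minute"
OLD_WRITE_TIMEOUT_S = 180.0    # "a much longer timeout"
ECHO_AFTER_S = 30.0            # "about 30 seconds"
MAX_MISSED_ECHOES = 2          # "a few echo requests"


def old_should_reconnect(kind: str, waited_s: float, past_eof: bool = False) -> bool:
    """Older policy: a per-request timeout decides whether to reconnect."""
    if kind == "write":
        if past_eof:
            # Writes beyond end of file never timed out.
            return False
        return waited_s > OLD_WRITE_TIMEOUT_S
    if kind == "blocking_lock":
        # Assumption: exempt from the short timeout, never reconnects.
        return False
    # Everything else used the short default timeout.
    return waited_s > OLD_DEFAULT_TIMEOUT_S


def new_action(waited_s: float, missed_echoes: int) -> str:
    """Newer policy: probe with SMB echo, reconnect only if echoes go unanswered.

    missed_echoes is the number of consecutive echo requests the server
    has failed to answer so far.
    """
    if missed_echoes >= MAX_MISSED_ECHOES:
        return "reconnect"
    if waited_s > ECHO_AFTER_S:
        return "send_echo"
    return "wait"
```

With these assumed numbers, a read that has been pending for an hour
(the offline tape file case) makes old_should_reconnect("read", 3600.0)
return True, while new_action(3600.0, 0) only returns "send_echo" as
long as the server keeps answering echoes.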