On Fri, 18 Feb 2011 12:30:04 -0600
Wayne Walker <wwalker@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Thu, Feb 10, 2011 at 11:14:59PM -0600, Wayne Walker wrote:
> > First, I'm not certain whether this is samba, the linux cifs driver, or
> > something else.
> >
> > During testing, one of my QA guys was running an inhouse program that
> > generates pseudo-random, but fully recreatable, data and writes it to
> > a file, the file is named with a name that is essentially the seed to
> > the pseudo-random stream, so, given a filename, it can read the file
> > and verify that the data is correct.
> ... snip ...
>
> So, my QA guy has repeated the failure - 93 times, only from a linux
> box, so it appears to definitely be a cifs driver issue.
>
> What can I do to gather useful info? tcpdump on both client and server
> drop too many packets to be useful.
>

I asked before, but I don't think you ever gave a conclusive answer...

Did the kernel report an error when you did a fsync() or close()? I
suspect that it did, but sadly a lot of programs don't bother to check
for that (usually because they're not really able to deal with it).

> From a Linux client (hostname: acorn):
> Feb 17 16:54:30 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
> Feb 17 16:57:10 acorn kernel: CIFS VFS: No response to cmd 47 mid 46382
> Feb 17 16:57:10 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
> Feb 17 16:57:16 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
> Feb 17 16:57:31 acorn kernel: CIFS VFS: No response for cmd 50 mid 46388
> Feb 17 16:59:52 acorn kernel: CIFS VFS: No response to cmd 47 mid 64873
> Feb 17 16:59:52 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
> Feb 17 16:59:53 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
>

Those mean that calls to the server were occasionally timing out.
That's not terribly unusual under heavy load. Until very recently when
that happened, the kernel would treat that like a hard error and would
disconnect the socket.
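On the fsync()/close() question above, here is a rough sketch of what checking for those errors might look like in a test writer. It is only an illustration: the function names write_and_sync and describe are made up for this example and have nothing to do with the in-house program. For reference, the -11 in the log lines is -EAGAIN on Linux; which errno (if any) reaches the application at fsync() or close() time is exactly what this kind of check would tell you.

```python
import errno
import os


def write_and_sync(path, data):
    """Write data to path and return the errno of the first failure.

    Returns None if write(), fsync() and close() all succeed.
    Errors from os.open() itself are not caught and propagate.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    first_error = None
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # may write fewer bytes than asked
            view = view[written:]
        os.fsync(fd)  # deferred write-back errors can surface here
    except OSError as exc:
        first_error = exc.errno
    finally:
        try:
            os.close(fd)  # close() can also report a failed write-back
        except OSError as exc:
            if first_error is None:
                first_error = exc.errno
    return first_error


def describe(err):
    """Return a readable 'NAME: message' string for an errno value."""
    return "%s: %s" % (errno.errorcode.get(err, str(err)), os.strerror(err))
```

A caller would log describe(err) whenever write_and_sync() returns something other than None, so that a failed write-back shows up next to the filename (the seed) instead of only being noticed later as a verification mismatch.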
You may want to test something more recent (like 2.6.38-rc5) to see if
the problems go away with that. Since you mention you're using CentOS
you could also open a bug at bugzilla.redhat.com and I'll try to look at
it when I get time. If you have a RH support contract you may also want
to open a support case with this problem which would allow me to give
it more priority.

Cheers,
-- 
Jeff Layton <jlayton@xxxxxxxxxx>