On Wed, 19 Dec 2012 11:30:32 -0800 (PST)
Tim Perry <tim.perry@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:

> Dear Jeff, et al.,
>
> I can reproduce the problem by starting "find . -name \*.ext" and
> killing it when connected to either of our two Windows 2003 Servers.
> I can *not* reproduce it doing the same thing connected to a
> Windows 7 box.
>
> $ uname -a
> Linux servername 3.2.0-34-generic #53-Ubuntu SMP Thu Nov 15 10:49:02 UTC 2012 i686 i686 i386 GNU/Linux
> $ cat /proc/version
> Linux version 3.2.0-34-generic (buildd@roseapple) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #53-Ubuntu SMP Thu Nov 15 10:49:02 UTC 2012
> $ lsb_release -a
> No LSB modules are available.
> Distributor ID: Ubuntu
> Description: Ubuntu 12.04.1 LTS
> Release: 12.04
> Codename: precise
>
> I tried using strace but hitting ctrl-c killed strace (obviously,
> oops), but interestingly, this did *not* hang the file system. I will
> try and kill the find command (kill -9 perhaps?) and see if I can
> recreate the error that way.
>
> CONTINUING HERE:
> I don't think strace on the find command will help because it isn't
> making the network connections. CIFS is making the network
> connections. Maybe I can cause the mount to happen with an strace
> version of CIFS? How would I do that?
>
> Anyhow, I opened two terminal windows and proceeded as follows:
>
> In terminal 1:
>
> $ strace find . -name \*adzzz >& ~/straceFind.txt
>
> In terminal 2:
>
> $ ps aux | grep find | grep -v strace
> perry 2583 12.6 0.0 4792 1088 pts/5 R+ 11:27 0:00 find . -name *adzzz
> perry 2585 0.0 0.0 4388 828 pts/2 S+ 11:27 0:00 grep find
> $ kill -9 2583
>
> File system dies.
>
> I've attached the straceFind.txt, but it just shows find walking the
> filesystem tree:
>
> fstatat64(AT_FDCWD, "0010", {st_mode=S_IFDIR|0777, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
> openat(AT_FDCWD, "0010", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 5
> fchdir(5) = 0
> getdents64(5, /* 14 entries */, 32768) = 448
> getdents64(5, /* 0 entries */, 32768) = 0
> close(5) = 0
> fstatat64(AT_FDCWD, "_vti_cnf", {st_mode=S_IFDIR|0777, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
> openat(AT_FDCWD, "_vti_cnf", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 5
> fchdir(5) = 0
> getdents64(5, /* 13 entries */, 32768) = 416
> getdents64(5, /* 0 entries */, 32768) = 0
> close(5) = 0
> open("..", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_NOFOLLOW) = 5
> fstat64(5, {st_mode=S_IFDIR|0777, st_size=0, ...}) = 0
> fchdir(
>
> Ideas?
>

That kernel is pretty old, so you may want to try a more recent one.
You may first want to start by tracing with wireshark -- see what's
happening on the wire before and after the signal is delivered.

If it works against win7 then it's likely that win7 disconnects the
socket when the signatures are wrong. With that, we'd reestablish the
connection and things would start working again. I suspect that win2k8
just starts returning an error that we map to -EACCES.

It's possible that we should disconnect the client when the signatures
start looking wrong, but I think we need to understand why signals are
causing this issue in the first place. There are some places where we
do interruptible sleeps (vs. killable ones). It's possible that SIGINT
(which is what ^c generally delivers) is causing havoc there.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
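
The two-terminal reproduction quoted above could be scripted so that
the signal arrives at a repeatable point, which may make it easier to
line up with a wire trace. What follows is only a minimal sketch in
Python. The helper names kill_after and mount_responds are
hypothetical (they are not part of any CIFS tooling), and the
10-second probe timeout is an arbitrary assumption.

```python
import signal
import subprocess
import time


def kill_after(cmd, delay, sig=signal.SIGKILL):
    """Start cmd, sleep for `delay` seconds, send `sig`, return the exit status.

    A negative return value means the child was terminated by that signal
    number (for example -9 for SIGKILL).
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    time.sleep(delay)
    proc.send_signal(sig)
    return proc.wait()


def mount_responds(path, timeout=10.0):
    """Return True if `ls path` exits successfully within `timeout` seconds."""
    probe = subprocess.Popen(
        ["ls", path], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        return probe.wait(timeout=timeout) == 0
    except subprocess.TimeoutExpired:
        # Best effort only: do not wait for the probe afterwards, because it
        # may be blocked inside the kernel and never exit.
        probe.kill()
        return False
```

Run from the top of the mount, kill_after(["find", ".", "-name",
"*adzzz"], 2.0) followed by mount_responds(".") mirrors the kill -9
case. Passing sig=signal.SIGINT instead exercises the ^c case
mentioned above.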