On Wed, 21 Oct 2015, krichy@xxxxxxxxxxxx wrote:

> Dear devs,
>
> We have an nfs lockup issue. We run a ganeti cluster consisting of 7 debian
> linux nodes and 1 freenas for hosting the vm images. The images are exported
> via nfsv3. The problem is that randomly we end in a livelock on one of our
> nodes.
>
> That means the nfs share is alive, we can list directories, files, even can
> read files (very slow, see later). And even can write to files, but the file
> close operation does not return, it gets blocked.
>
> The read is slow in that way that while copying a file from the share to /tmp,
> the data arrives very fast to the node, but in /tmp it accumulates slowly.
>
> I've also opened a debian bug report on it, but I think it is not related to
> debian (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=801924).
>
> The only way is to reboot machine, with all the vm's running on it getting
> interrupted.
>
> I've captured each tasks' stack trace, hopefully it helps someone to find out
> the issue.
>
> Meanwhile the other 6 nodes can access the nfs share right, so I think this is
> not a networking or server issue. Restarting the nfs server on the server side
> still does not have any effect, not recovering. The nfs tcp connection is
> established, listing files works again, but writes not.
>
> Some information of the nodes:
> # uname -a
> Linux host 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1+deb8u4 (2015-09-19)
> x86_64 GNU/Linux
>
> They have 1.5G ram allocated to dom0, that should be enough.
>
> I know this information is little information, give me advice what to look for
> next time. Unfortunately I dont know how to reproduce it.
>
> Thanks in advance,
>
> Kojedzinszky Richard
> Euronet Magyarorszag Informatika Zrt.

I took a look at your debian bug report.. what's up with those drbd procs?
Are you writing to drbd-backed devs, and have you made sure that's not
involved in any way?

Ben