On Oct 31, 2011, at 9:21 AM, David Flynn wrote:

> * Chuck Lever (chuck.lever@xxxxxxxxxx) wrote:
>> David, what would help immensely is if you can find a reliable way of
>> reproducing this.  So far we have been unable to find a reproducer.
>
> While I've managed to have problems with individual machines, which seem
> to fail at some random point of their own choosing, the most reliable
> way to produce the problem for us is to have a number of nodes updating
> various RRD files frequently.
>
> Given that I haven't found a reliable and short method for reproducing
> it either, if we were to re-run the known case and capture all network
> traffic, would you be able to extract the relevant detail to generate a
> simulation?

A reproducer would be better for us [*], but I understand the arbitrary nature of the problem.  A network trace would be an excellent start.

Now, it would be interesting if in fact the problem occurs only when multiple clients interact with a server.  In that case, capture a full network trace with snoop on your server.  We'll worry about pruning the size of the trace once you have a clean capture.

-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

[*] A reproducer allows us to perform internal-only tests at will, and it can also confirm we've got the problem properly fixed.
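
As a possible starting point for a reproducer, the workload David describes (frequent updates to a number of RRD files) could be approximated with something like the sketch below. It is not known to trigger the problem. The record layout, file names, slot count, and the `create_files`, `update`, and `run` helpers are all hypothetical; the sketch only assumes that RRD-style updates amount to small in-place overwrites of preallocated, fixed-size files.

```python
import os
import struct
import tempfile
import time

# Hypothetical record layout (timestamp, value); this is NOT the real RRD format.
RECORD = struct.Struct("<dd")
SLOTS = 64  # assumed number of fixed slots per file, overwritten round-robin


def create_files(directory, count):
    """Preallocate `count` fixed-size files, as a round-robin database would."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"node{i}.rrdlike")
        with open(path, "wb") as f:
            f.write(b"\0" * (RECORD.size * SLOTS))
        paths.append(path)
    return paths


def update(path, step, value):
    """Overwrite one slot in place, then fsync so the write is pushed out."""
    with open(path, "r+b") as f:
        f.seek((step % SLOTS) * RECORD.size)
        f.write(RECORD.pack(time.time(), value))
        f.flush()
        os.fsync(f.fileno())


def run(directory, files=8, rounds=100, delay=0.0):
    """Update every file once per round; return the number of updates issued."""
    paths = create_files(directory, files)
    issued = 0
    for step in range(rounds):
        for path in paths:
            update(path, step, float(step))
            issued += 1
        time.sleep(delay)
    return issued


if __name__ == "__main__":
    # A temporary directory is used only so the sketch runs anywhere;
    # for a real test, pass a directory on the NFS mount instead.
    with tempfile.TemporaryDirectory() as d:
        print(run(d, files=4, rounds=10))
```

To approximate the multiple-client case, the same script would be started on several nodes at once, each pointed at a directory on the same NFS export, while the network trace is being captured on the server.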