On Thu, Jun 19, 2008 at 10:53:28AM -0500, Weathers, Norman R. wrote:
> The kernel that we were really seeing the problem with was 2.6.25.4, but
> I think we may have figured out the 4096 problem, and it was probably a
> mistake on my part, but it is important for the NFS users to see it so
> they don't make the same mistake.  I had found some performance tuning
> guides, and in trying some of the suggestions, found that the setting
> changes did seem to help on some things, but of course I never got to
> run a check under full load (800 + clients).  A suggestion was to change
> the tcp_reordering tunable under /proc/sys/net/ipv4 from the default 3
> to 127.  We think that this was actually causing the issue.  I was able
> to trace back through all of the changes, and I changed this setting
> back to the default 3, and it immediately fixed the size-4096 hell.  It
> appears that the reordering just eats into the memory, especially in
> high demand situations, and I guess that should make perfect sense if we
> are actually buffering up packets for reorder, and we are slamming the
> box with thousands of requests per minute.

OK, sounds plausible, though I won't pretend to understand exactly how
that reordering code is using memory.

> We still have other performance issues now, but it appears to be more of
> a bottleneck, the nodes do not appear to be backing off when the servers
> are becoming congested.
...
> > So with that many clients all making requests to the server at once,
> > we'd start hitting that (serv->sv_nrthreads+3)*20 limit when the
> > number of threads was set to less than 30-50.  That doesn't seem to
> > be the point where you're seeing a change in behavior, though.
> >
>
> We were estimating between 40 and 50 threads was the cut off for being
> able to service all of the (current) requests at once.  I haven't ramped
> back up to that level yet.
> I wasn't comfortable yet with letting it all
> hang back out just in case we get into that hellish mode again, it can
> be a pain to try and get into those systems once they are overloaded
> (even over serial, sometimes it can just timeout the login).  We had to
> actually bring online a second option to help alleviate some of the back
> congestion because the servers couldn't handle the workload.

Thanks for the update, and let us know if you figure out anything more.

--b.
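
For anyone following along, here is a rough sketch of the two checks this thread keeps coming back to: what tcp_reordering is currently set to, and how many nfsd threads it takes before the (serv->sv_nrthreads+3)*20 figure covers a given number of clients. It rests on assumptions that are mine, not anything confirmed in the thread: that each client holds a single connection, and that the formula acts as a ceiling on simultaneous connections. The function names are made up for illustration.

```python
import math
from pathlib import Path

# Default value mentioned in the thread for tcp_reordering.
TCP_REORDERING_DEFAULT = 3


def connection_limit(nrthreads: int) -> int:
    """Evaluate the formula quoted in the thread: (sv_nrthreads + 3) * 20."""
    return (nrthreads + 3) * 20


def min_threads_for(clients: int) -> int:
    """Smallest thread count n such that (n + 3) * 20 >= clients.

    Assumes one connection per client (an illustrative simplification).
    """
    return max(0, math.ceil(clients / 20) - 3)


def read_tcp_reordering(path: str = "/proc/sys/net/ipv4/tcp_reordering"):
    """Return the current tcp_reordering value, or None if the file is absent."""
    p = Path(path)
    if not p.exists():
        return None
    return int(p.read_text().strip())


if __name__ == "__main__":
    value = read_tcp_reordering()
    if value is None:
        print("tcp_reordering: not available on this system")
    elif value != TCP_REORDERING_DEFAULT:
        print(f"tcp_reordering is {value}, default is {TCP_REORDERING_DEFAULT}")
    else:
        print(f"tcp_reordering is at the default ({value})")

    clients = 800  # the "800 + clients" figure from the thread
    n = min_threads_for(clients)
    print(f"{clients} clients -> at least {n} threads "
          f"(limit {connection_limit(n)})")
```

Under those assumptions, 800 clients works out to 37 threads, since (37+3)*20 = 800, which falls inside the 30-50 range quoted above.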