On Fri, Jul 10, 2009 at 06:34:08PM -0400, bfields wrote: > On Thu, Jul 09, 2009 at 10:27:39AM -0700, Simon Kirby wrote: > > Hello, > > > > It seems this email to Greg Banks is bouncing (no longer works at SGI), > > Yes, I've cc'd his new address. (But he's on vacation.) > > > and I see git commit 59a252ff8c0f2fa32c896f69d56ae33e641ce7ad is still > > in HEAD (and still causing problems for our load). > > > > Can somebody else eyeball this, please? I don't understand enough about > > this particular change to fix the request latency / queue backlogging > > that this patch seems to introduce. > > > > It would seem to me that this patch is flawed because svc_xprt_enqueue() > > is edge-triggered upon the arrival of packets, but the NFS threads > > themselves cannot then pull another request off of the socket queue. > > This patch likely helps with the particular benchmark, but not in our > > load case where there is a heavy mix of cached and uncached NFS requests. > > That sounds plausible. I'll need to take some time to look at it. Looking back at that commit--I'm now confused about how it was meant to work. In the case where the woken-up thread is waiting inside of svc_recv(), ->nwaking doesn't get decremented at all until the request is processed and svc_recv() is called again--effectively limiting the number of concurrent requests to 5 per pool, so, if I read the code correctly, likely to cause problems if your workload would benefit from lots of requests being able to wait on io simultaneously (e.g. if you have a large working set and more than 5 spindles per pool). The nwaking count probably also leaks in some cases (e.g. if a thread exits?) I'm inclined to revert the patch and take another look at Greg's original problem. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html