2009/5/28 Greg Smith <gsmith@xxxxxxxxxxxxx>
On Thu, 28 May 2009, Flavio Henrique Araque Gurgel wrote:
It is 2.6.24. We had to apply the kswapd patch also. It's important, especially if you see your system % going as high as 99% in top and losing control of the machine. I have read something about 2.6.28 having this patch accepted in mainstream.
It would help if you gave more specific information about what you're talking about. I know there was a bunch of back and forth on the "kswapd should only wait on IO if there is IO" patch, where it was committed and then reverted, etc., but it's not clear to me if that's what you're talking about--and if so, what that has to do with the context switch problem.
Back to Fabrix's problem. You're fighting a couple of losing battles here. Let's go over the initial list:
1) You have 32 cores. You think they should be allowed to schedule 3500 active connections across them. That doesn't work, and what happens is exactly the sort of context switch storm you're showing data for. Think about it for a minute: how many of those can really be doing work at any time? 32, that's how many. Now, you need some multiple of the number of cores to try to make sure everybody is always busy, but that multiple should be closer to 10X the number of cores rather than 100X. You need to adjust the connection pool ratio so that the PostgreSQL max_connections is closer to 500 than 5000, and this is by far the most critical thing for you to do. The PostgreSQL connection handler is known to be bad at handling high connection loads compared to the popular pooling projects, so you really shouldn't throw this problem at it. While kernel problems stack on top of that, you really shouldn't start at kernel fixes; nail the really fundamental and obvious problem first.
This application is not closing its connections; the development team is making a change to close each connection after the job is done. So most connections are in the idle state. How much would this help? Could this be the real problem?
2) You have very new hardware and a very old kernel. Once you've done the above, if you're still not happy with performance, at that point you should consider using a newer one. It's fairly simple to build a Linux kernel using the same basic kernel parameters as the stock RedHat one. 2.6.28 is six months old now, is up to 2.6.28.10, and has gotten a lot more testing than most kernels due to it being the Ubuntu 9.04 default. I'd suggest you try out that version.
OK, I'll test whether updating the kernel improves this.
3) A system with 128GB of RAM is in a funny place where using the defaults or the usual rules of thumb for a lot of parameters ("set shared_buffers to 1/4 of RAM") is a bad idea. shared_buffers seems to top out its usefulness around 10GB on current generation hardware/software, and some Linux memory tunables have defaults on 2.6.18 that are insane for your system; vm.dirty_ratio at 40 comes to mind as the one I run into most. Some of that gets fixed just by moving to a newer kernel, some doesn't. Again, these aren't the problems you're having now though; they're the ones you'll have in the future *if* you fix the more fundamental problems first.
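To put rough numbers on the three points above, here is a small back-of-the-envelope sketch in Python. The function names are made up for illustration; the inputs (32 cores, 128GB of RAM, a dirty ratio of 40) come from this thread. The dirty-page figure is a simplification: the kernel applies the ratio to available memory, not to total RAM, so treat it as an upper-bound estimate rather than an exact value.

```python
# Rough sizing arithmetic for the machine discussed in this thread.
# All function names are hypothetical helpers, not part of PostgreSQL or Linux.

def pool_size(cores: int, multiple: int = 10) -> int:
    """Target count of active backend connections: a multiple of the core count."""
    return cores * multiple


def quarter_of_ram_gb(ram_gb: float) -> float:
    """What the usual 'shared_buffers = 1/4 of RAM' rule of thumb would suggest."""
    return ram_gb / 4


def dirty_page_ceiling_gb(ram_gb: float, dirty_ratio_pct: int) -> float:
    """Approximate ceiling on dirty page cache for a given dirty ratio.

    Simplification: uses total RAM, while the kernel uses available memory.
    """
    return ram_gb * dirty_ratio_pct / 100


if __name__ == "__main__":
    cores, ram_gb = 32, 128

    print("pool at 10X cores: ", pool_size(cores))        # 320
    print("pool at 100X cores:", pool_size(cores, 100))   # 3200
    print("1/4 of RAM (GB):   ", quarter_of_ram_gb(ram_gb))
    print("dirty ceiling (GB):", dirty_page_ceiling_gb(ram_gb, 40))
```

At 10X the pool lands at 320 connections, which is in the "closer to 500 than 5000" range, while 100X lands near the 3500 connections that are causing trouble now. The 1/4-of-RAM rule would suggest 32GB of shared_buffers, well past the roughly 10GB where it seems to stop helping, and a dirty ratio of 40 allows on the order of 50GB of unwritten data to pile up.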
--
* Greg Smith gsmith@xxxxxxxxxxxxx http://www.gregsmith.com Baltimore, MD