On Fri, May 02, 2008 at 05:46:53PM +0300, Volkan YAZICI wrote:
> In our current structure, responsiveness has the
> highest priority and thus it is ok for us to cancel queries at that
> instant and re-initiate connections. To achieve this effect, I started
> to turn swap space off on some of the servers and turned
> vm.oom_kill_allocating_task kernel parameter on. (Periodic postgres
> process availability checks decide whether there is a need to fire up a
> fresh postgres instance.) So far, this method worked pretty well but I'm
> suspicious about data corruption. (Disk configurations are set to RAID
> 10.) What are the downsides of such a design scheme?

One big problem is that the OOM killer will quite possibly decide to
kill the postmaster daemon process as opposed to any of its children.
The children don't necessarily die in that case. If you start up a new
postmaster at this point, you will almost certainly corrupt your data.

Why are you allowing memory overcommit at all? And what is causing you
to swap? I think those are the things you need to fix.

A

-- 
Andrew Sullivan
ajs@xxxxxxxxxxxxxxxxx
+1 503 667 4564 x104
http://www.commandprompt.com/
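
As a rough illustration of the overcommit question above, here is a minimal Python sketch (not part of the original message) that reports whether the kernel is allowed to overcommit memory. It assumes a Linux host that exposes /proc/sys/vm/overcommit_memory. The function names are hypothetical; the mode meanings (0 heuristic, 1 always, 2 strict accounting) follow the kernel's documented sysctl values.

```python
from pathlib import Path

# Meanings of vm.overcommit_memory as documented for the Linux kernel.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def read_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the integer overcommit mode stored in the given sysctl file."""
    return int(Path(path).read_text().strip())


def overcommit_allowed(mode):
    """Return True when the kernel may promise more memory than it can back."""
    if mode not in OVERCOMMIT_MODES:
        raise ValueError(f"unknown overcommit mode: {mode}")
    # Only mode 2 (strict accounting) refuses allocations up front.
    return mode != 2
```

On a server, `overcommit_allowed(read_overcommit_mode())` returning True means allocations can succeed beyond what the kernel can actually back, which is the situation in which the OOM killer is most likely to act. In mode 2 such allocations fail at request time instead.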