On Thu, 28 Aug 2008, Jeff Davis wrote:
> The problem for the postmaster is that the OOM killer counts the children's total vmsize -- including *shared* memory -- against the parent, which is such a bad idea I don't know where to start. If you have shared_buffers set to 1GB and 25 connections, the postmaster will be penalized as though it was using 13.5 GB of memory, even though all the processes together are only using about 1GB!
I find it really hard to believe that it counts shared memory like that. That's just dumb.
Of course, there are two types of "shared" memory. There's explicit shared memory, like Postgres uses, and there's copy-on-write "shared" memory, caused by a process fork. The copy-on-write memory needs to be counted for each child, but the explicit shared memory needs to be counted just once.
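
To make the difference concrete, here is a rough Python sketch of the two ways of counting. It is only an illustration, not kernel code. The Proc class and both functions are hypothetical names, and the "half of each child's vmsize" rule is an assumption chosen because it reproduces the 13.5 GB figure quoted above. Private memory is set to zero purely to keep the arithmetic simple.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    private_mb: int  # pages only this process holds, including its copy-on-write copies
    shared_mb: int   # explicit shared segment mapped into this process

    @property
    def vmsize_mb(self) -> int:
        return self.private_mb + self.shared_mb


def naive_parent_charge_mb(parent, children):
    # ASSUMPTION: the parent is charged its own vmsize plus half of each
    # child's vmsize, with shared memory included every time.
    return parent.vmsize_mb + sum(c.vmsize_mb // 2 for c in children)


def actual_footprint_mb(procs):
    # Private memory is counted once per process; the explicit shared
    # segment is counted once in total (assumes all processes map the
    # same segment).
    private = sum(p.private_mb for p in procs)
    shared = max((p.shared_mb for p in procs), default=0)
    return private + shared


# shared_buffers = 1GB, 25 connections, negligible private memory
postmaster = Proc(private_mb=0, shared_mb=1024)
backends = [Proc(private_mb=0, shared_mb=1024) for _ in range(25)]

naive = naive_parent_charge_mb(postmaster, backends)
actual = actual_footprint_mb([postmaster] + backends)
print(naive / 1024, actual / 1024)  # 13.5 1.0
```

Under those assumptions the naive count charges the postmaster 13.5 GB while the real footprint is about 1 GB.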
> Not only that, killing a process doesn't free shared memory, so it's just flat out broken.
Exactly. A cost-benefit model would work well here. Work out how much RAM would be freed by killing a process, and use that when choosing which process to kill.
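
Something along these lines, as a minimal Python sketch of the idea rather than anything the kernel actually does. The function names and the example numbers are made up for illustration, and it assumes that killing a single process releases only its private pages while the explicit shared segment stays allocated.

```python
def freed_if_killed_mb(private_mb, shared_mb):
    # ASSUMPTION: killing one process frees its private pages only; the
    # explicit shared segment stays allocated, so shared_mb earns nothing.
    return private_mb


def pick_victim(procs):
    # procs maps a process name to a (private_mb, shared_mb) tuple.
    # Choose the process whose death would free the most RAM.
    return max(procs, key=lambda name: freed_if_killed_mb(*procs[name]))


# Hypothetical numbers, for illustration only
procs = {
    "postmaster": (10, 1024),
    "backend": (40, 1024),
    "runaway": (3000, 0),
}
print(pick_victim(procs))  # runaway
```

With this scoring the postmaster is the least attractive victim, because killing it would free almost nothing.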
Matthew

--
You will see this is a 3-blackboard lecture. This is the closest you are going to get from me to high-tech teaching aids. Hey, if they put nooses on this, it would be fun!
    -- Computer Science Lecturer