Andreas Rieke <andreas.rieke@xxxxxx> writes:
> I am the guy who posted the problem to mod_perl, and yes, I am quite
> sure that we are talking about the right numbers. The best argument is
> that the machine in fact starts swapping when memory is gone - and this
> means there is neither free nor cached memory left.

Andreas, what it sounds like to me is a kernel memory leak probably
triggered by Postgres' use of SysV shared memory (which is not a heavily
used kernel feature these days, so bugs in it are hardly out of the
question). A couple of facts that might help you narrow your theories:

1. When the postmaster starts up, it allocates one, count 'em one,
shared memory segment that is never thereafter changed in size.

2. When the postmaster shuts down, it issues a shmctl(IPC_RMID) call
against that segment. The kernel should thereupon mark the segment for
destruction, and then actually destroy it when the last process
connected to it is gone. In a normal shutdown that would mean
immediately (because the postmaster waits for all its child processes
to die first), but in an "immediate mode" shutdown there might still be
children alive at the instant of the shmctl.

Within this context, the only way to cause a memory leak is to "kill -9"
the postmaster instead of giving it a chance to exit gracefully. In
that case the shmctl(IPC_RMID) never happens and the memory segment
isn't reclaimed. However, if that were your problem then the evidence
would be real clear in "ipcs -m -a" output: lots of postgres-owned
segments with zero attached processes. (There actually is code in the
postmaster to try to find and destroy such orphaned segments during
postmaster restart, but it's not 100% guaranteed to find everything.)

If the shared segment is no longer present according to ipcs, and there
are no postgres processes still running, then it's simply not possible
for it to be postgres' fault if memory has not been reclaimed. So
you're looking at a kernel bug.

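For what it's worth, the ipcs check described above can be scripted.
What follows is only a minimal sketch. It assumes the Linux (util-linux)
"ipcs -m" column layout of key / shmid / owner / perms / bytes / nattch /
status, and that the server runs as the OS user "postgres". The function
names are made up for illustration, and BSD-style ipcs output is laid
out differently and would need its own parsing.

```python
import subprocess

# Columns we rely on. ASSUMPTION: Linux util-linux `ipcs -m` header names.
REQUIRED = ("shmid", "owner", "bytes", "nattch")


def orphaned_segments(ipcs_text, owner="postgres"):
    """Return [(shmid, size_in_bytes)] for segments owned by `owner`
    that have zero attached processes (nattch == 0)."""
    columns = None
    orphans = []
    for line in ipcs_text.splitlines():
        fields = line.split()
        # The header row tells us which position each column is in.
        if all(name in fields for name in REQUIRED):
            columns = {name: fields.index(name) for name in REQUIRED}
            continue
        # Skip banner lines, blank lines, and rows that are too short.
        if columns is None or len(fields) <= max(columns.values()):
            continue
        if fields[columns["owner"]] != owner:
            continue
        try:
            nattch = int(fields[columns["nattch"]])
            size = int(fields[columns["bytes"]])
        except ValueError:
            continue  # not a data row
        if nattch == 0:
            orphans.append((fields[columns["shmid"]], size))
    return orphans


def read_ipcs():
    """Run `ipcs -m` and return its text output (hypothetical helper)."""
    result = subprocess.run(
        ["ipcs", "-m"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling orphaned_segments(read_ipcs()) after a shutdown should give an
empty list if the segment was reclaimed. A non-empty list with no
postgres processes running would point at the "kill -9" scenario rather
than at the kernel.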
As to the nature of the bug ... we saw something similar in older
versions of OS X:
http://archives.postgresql.org/pgsql-general/2004-08/msg00972.php
Since Darwin is BSD-derived, an ancient common bug seems possible.
(BTW, I just repeated the above experiment in OS X 10.4.8, and see no
leak, so Apple did fix it somewhere along the line.)

Anyway I'd suggest trying to duplicate the problem without apache by
firing new backends rapidly as in the above message. If you can, file a
kernel bug report.

			regards, tom lane