Andy Chambers <achambers@xxxxxxxx> writes:
> We've just run into the dreaded "OOM Killer".  I see that on Linux
> >2.6, it's recommended to turn off memory overcommit.  I'm trying to
> understand the implications of doing this.  The interweb says this
> means that forking servers can't make use of "copy on write"
> semantics.  Is this true?

Don't know where you read that, but it's nonsense AFAIK.

The actual issue here is that when a process fork()s, initially the
child shares all the pages of the parent process.  Over time, both the
child and the parent will dirty pages that had been shared, forcing a
copy-on-write to happen, after which there's a separate copy of such
pages for each process.  So if the parent had N pages, the ultimate
memory requirement will be for something between N and 2N pages, and
there's not a very good way to know in advance what it will be.

Now the problem the kernel has is, what if a COW needs to happen and it
has no place to put the new page?  It cannot report an ENOMEM failure
because the process is not making a failable kernel call, it's just
writing some memory that it has every reason to think it can write.
About all the kernel can do is terminate that process, i.e., OOM kill.

The only way to be certain an OOM kill cannot happen is if you reserve
N pages' worth of memory/swap space for the child process when you do
the fork (since then you can fail the fork call, if there's not that
much available).  You can still do COW rather than physically
duplicating the whole address space right away, but you have to "bank"
enough spare space to be sure there will be room when and if the time
comes.  "Overcommit" simply means that the kernel doesn't do such
conservative advance reservation, and so it might be forced into an OOM
kill.

The downside of turning off overcommit is that you will have pretty
severe under-utilization of your memory, since in practice a lot of a
process's address space is read-only and can be shared indefinitely by
parent and child.
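
To put rough numbers on the reservation argument above, here is a
minimal sketch, not a description of the kernel's actual bookkeeping.
It assumes strict accounting boils down to "the parent's writable size
must fit in CommitLimit minus Committed_AS", using those two fields as
they appear in /proc/meminfo.  The function names and the SAMPLE
figures are made up for illustration.

```python
# Simplified model of fork() under strict commit accounting.
# All names and the SAMPLE numbers below are hypothetical.

SAMPLE = """MemTotal:        8000000 kB
CommitLimit:     6000000 kB
Committed_AS:    5000000 kB
"""


def parse_meminfo(text):
    """Return {field_name: value_in_kB} from /proc/meminfo-style text."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def fork_reservation_fits(meminfo, parent_writable_kb):
    """Strict model: the child's worst case (N more kB) must be banked up front."""
    headroom_kb = meminfo["CommitLimit"] - meminfo["Committed_AS"]
    return parent_writable_kb <= headroom_kb


def eventual_usage_kb(parent_writable_kb, dirtied_percent):
    """Memory used once COW has copied dirtied_percent of the shared pages.

    The result always lies between N (nothing dirtied) and 2N (everything).
    """
    if not 0 <= dirtied_percent <= 100:
        raise ValueError("dirtied_percent must be between 0 and 100")
    return parent_writable_kb + parent_writable_kb * dirtied_percent // 100


if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print(fork_reservation_fits(info, 2000000))  # strict mode would fail the fork
    print(eventual_usage_kb(2000000, 10))        # what the pair really ends up using
```

With the sample figures, a parent with 2000000 kB of writable memory is
refused the fork because only 1000000 kB of headroom is left, even
though dirtying 10% of the pages would have needed just 200000 kB
extra.  That gap is the under-utilization.  On a real Linux box the
text would come from reading /proc/meminfo instead of SAMPLE.
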
This under-utilization can usually be alleviated by providing a lot of
swap space that you expect won't get used.  Of course, if your tuning
calculations are off and the swap does start getting used a lot,
performance goes to hell in a handbasket.

So it's a tradeoff --- do you want to keep running but possibly slowly,
or are you willing to cope with OOM kills for better average
utilization of your hardware?

			regards, tom lane