On Thu, 28 Aug 2008, Craig James wrote:
david@xxxxxxx wrote:
On Wed, 27 Aug 2008, Craig James wrote:
The OOM killer is a terrible idea for any serious database server. I
wrote a detailed technical paper on this almost 15 years ago when Silicon
Graphics had this same feature, and Oracle and other critical server
processes couldn't be made reliable.
The problem with "overallocating memory" as Linux does by default is that
EVERY application, no matter how well designed and written, becomes
unreliable: It can be killed because of some OTHER process. You can be as
clever as you like, and do all the QA possible, and demonstrate that there
isn't a single bug in Postgres, and it will STILL be unreliable if you run
it on a Linux system that allows overcommitted memory.
IMHO, all Postgres servers should run with memory-overcommit disabled. On
Linux, that means /proc/sys/vm/overcommit_memory=2.
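As an illustration of the difference, here is a minimal sketch (assuming a 64-bit Linux box; the 64 GiB figure is arbitrary, just something larger than your commit limit). With vm.overcommit_memory=2 the oversized malloc() fails immediately, where it can be handled; with the overcommitting modes, an allocation the kernel chooses to grant fails only when the pages are touched, and the failure arrives as the OOM killer:

/*
 * Sketch only: the 64 GiB size is an arbitrary example, pick something
 * larger than your commit limit (swap + overcommit_ratio% of RAM).
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    size_t huge = (size_t)64 << 30;        /* 64 GiB -- adjust for your box */
    char *p = malloc(huge);

    if (p == NULL) {
        /* Under strict accounting (overcommit_memory=2) the failure shows
         * up here, where the application can handle it gracefully. */
        fprintf(stderr, "malloc failed: recoverable\n");
        return 1;
    }

    /* If the kernel granted the allocation without backing for it, the
     * failure is deferred until the pages are actually touched, and it
     * arrives as the OOM killer -- possibly aimed at some other,
     * completely innocent process. */
    puts("malloc succeeded; touching pages...");
    memset(p, 0, huge);
    free(p);
    return 0;
}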
It depends on how much stuff you allow others to run on the box. If you
have no control over that then yes, the box is unreliable (but it's not just
because of the OOM killer, it's because those other users can eat up all
the other box resources as well: CPU, network bandwidth, disk bandwidth,
etc.)
Even with overcommit disabled, the only way you can be sure a program
will not fail is to make sure it never needs to allocate memory. With
overcommit off you could have one program that eats up 100% of your RAM
without failing (handling the error on memory allocation so that it
doesn't crash), but which will cause _every_ other program on the system to
fail, including any scripts, because every command executed requires a
fork, and without overcommit that fork must reserve as much memory as your
shell has already allocated just to run a trivial command
(like the ps or kill you are trying to use to fix the problem).
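A minimal sketch of that fork() failure mode (the 4 GiB figure below is arbitrary; to reproduce the effect, size it near your machine's commit limit):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
    size_t big = (size_t)4 << 30;      /* 4 GiB of private, written-to memory */
    char *p = malloc(big);

    if (p == NULL) {
        perror("malloc");
        return 1;
    }
    memset(p, 1, big);                 /* actually use it */

    /* With overcommit disabled, fork() must be able to commit another
     * copy of the parent's writable private memory, even though the
     * child is only going to exec a tiny program like ps. */
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");                /* ENOMEM under strict accounting */
    } else if (pid == 0) {
        execl("/bin/ps", "ps", (char *)NULL);
        _exit(127);
    } else {
        waitpid(pid, NULL, 0);
    }

    free(p);
    return 0;
}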
If you have a box with unpredictable memory use, disabling overcommit will
not make it reliable. It may make it less unreliable (the fact that the
Linux OOM killer will pick one of the worst possible processes to kill is a
problem), but less unreliable is not the same as reliable.
The problem with any argument in favor of memory overcommit and OOM is that
there is a MUCH better, and simpler, solution. Buy a really big disk, say a
terabyte, and allocate the whole thing as swap space. Then do a decent job
of configuring your kernel so that any reasonable process can allocate huge
chunks of memory that it will never use, but can't use the whole terabyte.
Using real swap space instead of overallocated memory is a much better
solution.
- It's cheap.
Cheap in dollars; if you actually use any of it, it's very expensive in
performance.
- There is no performance hit at all if you buy enough real memory
- If runaway processes start actually using memory, the system slows
down, but server processes like Postgres *aren't killed*.
- When a runaway process starts everybody swapping, you can just
find it and kill it. Once it's dead, everything else goes back
to normal.
All of these things are still true if you enable overcommit. The
difference is that with overcommit enabled, your actual RAM will be used
for cache as much as possible; with overcommit disabled, you keep
throwing away cache to make room for memory that's allocated but never
written to.
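The accounting behind that can be watched directly: every allocation is charged to Committed_AS whether or not it is ever written, and under strict overcommit that charge may not exceed CommitLimit (swap plus overcommit_ratio percent of RAM). A minimal sketch (the 1 GiB figure is arbitrary) comparing the counters before and after an untouched malloc():

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the value (in kB) of a /proc/meminfo line such as "CommitLimit:". */
static long meminfo_kb(const char *key)
{
    FILE *f = fopen("/proc/meminfo", "r");
    char line[128];
    long kb = -1;

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, key, strlen(key)) == 0) {
            sscanf(line + strlen(key), "%ld", &kb);
            break;
        }
    }
    fclose(f);
    return kb;
}

int main(void)
{
    size_t big = (size_t)1 << 30;          /* 1 GiB, allocated but never touched */

    printf("CommitLimit:          %ld kB\n", meminfo_kb("CommitLimit:"));
    printf("Committed_AS before:  %ld kB\n", meminfo_kb("Committed_AS:"));

    char *p = malloc(big);                 /* charged to Committed_AS immediately */

    printf("Committed_AS after:   %ld kB\n", meminfo_kb("Committed_AS:"));

    free(p);
    return 0;
}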
I generally allocate 2G of disk to swap; if the system ends up using even
that much it will have slowed to a crawl. If you are worried that
that's not enough, by all means go ahead and allocate more, but allocating
a whole 1TB disk is overkill (do you realize how long it takes just to _read_ an
entire 1TB disk? try it sometime with dd if=/dev/drive of=/dev/null -- at
roughly 100 MB/s of sequential throughput that's on the order of three hours)
David Lang
It's hard to imagine a situation where any program or collection of programs
would actually try to allocate more than a terabyte of memory and exceed the
swap space on a single terabyte disk. The cost is almost nothing: a few
hundred dollars.
So turn off overcommit, and buy an extra disk if you actually need a lot of
"virtual memory".
Craig