Re: select on 22 GB table causes "An I/O error occured while sending to the backend." exception

david@xxxxxxx · Wed, 27 Aug 2008 23:23:16 -0700 (PDT)

On Thu, 28 Aug 2008, Tom Lane wrote:

david@xxxxxxx writes:
On Wed, 27 Aug 2008, Andrew Sullivan wrote:
The upshot of this is that postgres tends to be a big target for the
OOM killer, with seriously bad effects to your database.  So for good
Postgres operation, you want to run on a machine with the OOM killer
disabled.

I disagree with you.

Actually, the problem with Linux' OOM killer is that it
*disproportionately targets the PG postmaster*, on the basis not of
memory that the postmaster is using but of memory its child processes
are using.  This was discussed in the PG archives a few months ago;
I'm too lazy to search for the link right now, but the details and links
to confirming kernel documentation are in our archives.

This is one hundred percent antithetical to the basic design philosophy
of Postgres, which is that no matter how badly the child processes screw
up, the postmaster should live to fight another day.  The postmaster
basically exists to restart things after children die ungracefully.
If the OOM killer takes out the postmaster itself (rather than the child
that was actually eating the unreasonable amount of memory), we have no
chance of recovering.

So, if you want a PG installation that is as robust as it's designed to
be, you *will* turn off Linux' OOM killer.  Otherwise, don't complain to
us when your database unexpectedly stops responding.
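As a hedged aside on what "turning off" usually means in practice: one
commonly cited approach is strict overcommit accounting
(vm.overcommit_memory = 2), which makes allocations fail rather than
letting the kernel overcommit and later kill something. It makes the OOM
killer far less likely to fire rather than removing it outright. A
minimal sketch for checking the current mode follows; the helper names
are hypothetical, and only the procfs path is real.

```python
from pathlib import Path

# Real Linux procfs path; absent on other operating systems.
OVERCOMMIT_PATH = Path("/proc/sys/vm/overcommit_memory")

MODES = {
    0: "heuristic overcommit (kernel default; OOM killer can fire)",
    1: "always overcommit (OOM killer can fire)",
    2: "strict accounting (allocations fail instead of overcommitting)",
}


def describe_overcommit(raw):
    """Translate the sysctl's text value into a short description."""
    return MODES.get(int(raw.strip()), "unknown mode")


def current_overcommit(path=OVERCOMMIT_PATH):
    """Return a description of the current mode, or None if unavailable."""
    try:
        return describe_overcommit(path.read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print(current_overcommit())
```

Changing the setting itself needs root and is normally done with sysctl,
not from a script like this.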

(Alternatively, if you know how an unprivileged userland process can
defend itself against such exceedingly brain-dead kernel policy, we are
all ears.)

There are periodic flamefests on the kernel mailing list over the OOM
killer. If you can propose a better algorithm than the current one, one
that doesn't end up being just as bad for some other workload, the
kernel policy can be changed.

IIRC the reason it targets the parent process is to deal with a
fork-bomb type of failure, where a program doesn't use much memory
itself but forks off memory hogs as quickly as it can. If the OOM killer
only kills the children, the problem never gets solved.
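To make the effect concrete, here is a toy model of that kind of
scoring. It is not the kernel's code: the function name and all numbers
are made up, and the "half of each child's memory" weighting is an
assumption based on my recollection of 2.6-era kernels.

```python
def toy_badness(own_mb, children_mb):
    """Toy score: the process's own memory plus half of each child's."""
    return own_mb + sum(child // 2 for child in children_mb)


# Hypothetical sizes in MB, chosen only for illustration.
# A small parent with 20 large children (postmaster-like, or a fork bomb):
parent = toy_badness(own_mb=50, children_mb=[400] * 20)
# One of those children, scored on its own:
hog = toy_badness(own_mb=400, children_mb=[])

print(parent, hog)  # prints "4050 400"
```

Under this model the parent scores roughly ten times higher than any
single child, even though it uses the least memory itself. That is what
stops a fork bomb, and also what makes a parent of many large children
the first target.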

I assume that the postmaster process is monitoring the back-end
processes by being their parent. Is there another way that this
monitoring could be done, so that the back-end processes become
independent of the monitoring tool after they are started (the
equivalent of nohup)?

While this approach to monitoring may not be as quick to react as
waiting for a child exit, it may be worth doing if it keeps the
postmaster from being the prime target of the OOM killer when things go
bad on the system.
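As a rough sketch of what polling-based monitoring of a non-child
process could look like (this is not how the postmaster works; the
function names are hypothetical, and it assumes a POSIX system where
signal 0 only checks that the PID exists):

```python
import os
import time


def is_alive(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0: existence/permission check only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but is owned by another user
    return True


def wait_for_exit(pid, interval=0.5, timeout=None):
    """Poll until the PID disappears.

    Returns True if the process went away, False if the timeout expired.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while is_alive(pid):
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True
```

The trade-offs are the ones mentioned above, plus a few more: the
monitor notices a death only at the next poll, it never sees the exit
status, and a recycled PID can be mistaken for the original process.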

			regards, tom lane

PS: I think this is probably unrelated to the OP's problem, since he
stated there was no sign of any problem from the database server's
side.

Agreed.

David Lang