Re: select on 22 GB table causes "An I/O error occured while sending to the backend." exception

"Scott Marlowe" <scott.marlowe@xxxxxxxxx> · Thu, 28 Aug 2008 20:01:53 -0600

On Thu, Aug 28, 2008 at 7:53 PM, Matthew Dennis <mdennis@xxxxxxxxxx> wrote:
> On Thu, Aug 28, 2008 at 8:11 PM, Scott Marlowe <scott.marlowe@xxxxxxxxx>
> wrote:
>>
>> > wait a min here, postgres is supposed to be able to survive a complete
>> > box
>> > failure without corrupting the database, if killing a process can
>> > corrupt
>> > the database it sounds like a major problem.
>>
>> Yes it is a major problem, but not with postgresql.  It's a major
>> problem with the linux OOM killer killing processes that should not be
>> killed.
>>
>> Would it be postgresql's fault if it corrupted data because my machine
>> had bad memory?  Or a bad hard drive?  This is the same kind of
>> failure.  The postmaster should never be killed.  It's the one thing
>> holding it all together.
>
> I fail to see the difference between the OOM killing it and the power going
> out.

Then you fail to understand.

Scenario 1:  There's a postmaster, and it owns all the child processes.
It gets killed.  The postmaster gets restarted.  Since there isn't one
running, it comes up and starts new child processes.  Meanwhile, the
old child processes that don't belong to it are busy writing to the
data store.  Instant corruption.

Scenario 2:  Someone pulls the plug.  Every postgres child dies a quick
death.  Data on the drives is coherent and recoverable.
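
To make scenario 1 concrete, here is a rough Python sketch of the underlying OS behaviour, assuming a Unix-like system.  It is not PostgreSQL code: the "postmaster" and "backend" are stand-in processes, and the function name orphan_survives_parent_kill is made up for this illustration.  It only shows that SIGKILLing a parent does not take its children down with it.

```python
import os
import signal
import time


def orphan_survives_parent_kill(wait_seconds=0.2):
    """Return True if a worker is still alive after its parent is SIGKILLed."""
    read_fd, write_fd = os.pipe()
    parent_pid = os.fork()
    if parent_pid == 0:
        # Stand-in "postmaster": fork one worker, report its pid, then idle.
        os.close(read_fd)
        worker_pid = os.fork()
        if worker_pid == 0:
            # Stand-in "backend": keeps running whatever happens to its parent.
            os.close(write_fd)
            time.sleep(30)
            os._exit(0)
        os.write(write_fd, str(worker_pid).encode())
        os.close(write_fd)
        time.sleep(30)
        os._exit(0)

    # Original process: learn the worker's pid, then kill the "postmaster".
    os.close(write_fd)
    worker_pid = int(os.read(read_fd, 32).decode())
    os.close(read_fd)

    os.kill(parent_pid, signal.SIGKILL)  # SIGKILL cannot be caught or cleaned up after
    os.waitpid(parent_pid, 0)            # reap the dead "postmaster"
    time.sleep(wait_seconds)

    try:
        os.kill(worker_pid, 0)           # signal 0 only checks that the pid exists
        alive = True
    except ProcessLookupError:
        alive = False

    if alive:
        os.kill(worker_pid, signal.SIGTERM)  # tidy up the demo worker
    return alive


if __name__ == "__main__":
    print("worker outlived its parent:", orphan_survives_parent_kill())
```

The surviving worker is the analogue of an old backend that is still writing after a new postmaster has come up.  In scenario 2 there is no such survivor, because every process stops at the same instant.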

> And yes, if the power went out and PG came up with a corrupted DB
> (assuming I didn't turn off fsync, etc) I *would* blame PG.

Then you might be wrong, for example if you were using LVM, or certain
levels of SW RAID, or a RAID controller whose cache is set to
write-back with no battery backing, or an IDE or SATA drive /
controller that didn't support write barriers, or NFS mounts for
database storage, and so on.  My point being that PostgreSQL HAS to
make certain assumptions about its environment that it simply cannot
directly control or test for.  Not having the postmaster shot in the
head while the children keep running is one of those things.

>  I understand
> that killing the postmaster could stop all useful PG work, that it could
> cause it to stop responding to clients, that it could even "crash" PG, et
> cetera, but if a particular process dying causes corrupted DBs, that sounds
> borked to me.

Well, design a better method and implement it.  If everything went
through the postmaster you'd be lucky to get 100 transactions per
second.  There are compromises between performance and reliability
under fire that have to be made.  It is not unreasonable to assume
that your OS is not going to randomly kill off processes because of a
dodgy VM implementation quirk.

P.S. I'm a big fan of Linux, and I run my dbs on it.  But I turn off
overcommit and make a few other adjustments to make sure my database
is safe.  The OOM killer as a default is fine for workstations, but
it's an insane setting for servers, much like swappiness=60 is an
insane setting for a server too.
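
For anyone who wants to check their own box, a minimal Python sketch of such a check follows.  It assumes a Linux /proc filesystem and assumes that "turning off overcommit" means vm.overcommit_memory = 2 (strict accounting).  The helper names and the swappiness_limit default of 60 are made up for this illustration, not recommended values.

```python
from pathlib import Path


def read_vm_setting(name, root="/proc/sys/vm"):
    """Return the integer value of a vm sysctl, or None if it is unreadable."""
    try:
        return int(Path(root, name).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def vm_warnings(overcommit_memory, swappiness, swappiness_limit=60):
    """Return a list of warnings for settings that look risky on a DB server."""
    warnings = []
    if overcommit_memory is None:
        warnings.append("could not read vm.overcommit_memory")
    elif overcommit_memory != 2:
        # 0 = heuristic overcommit, 1 = always overcommit, 2 = strict accounting
        warnings.append("vm.overcommit_memory is %d, not 2" % overcommit_memory)
    if swappiness is None:
        warnings.append("could not read vm.swappiness")
    elif swappiness >= swappiness_limit:
        # swappiness_limit is an illustrative threshold, not an official one
        warnings.append("vm.swappiness is %d" % swappiness)
    return warnings


if __name__ == "__main__":
    found = vm_warnings(read_vm_setting("overcommit_memory"),
                        read_vm_setting("swappiness"))
    for line in found or ["no warnings"]:
        print(line)
```

This only reports the current values.  Changing them is done with sysctl or /etc/sysctl.conf, and suitable values depend on the machine's RAM and swap.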