On Aug 10, 2010, at 11:38 AM, Karl Denninger wrote:
Depends on your tablespace setup and schema usage pattern. If:

* 90% of your data tables are partitioned by date and untouched a week after insert, and partitions are backed up incrementally;
* the remaining 10% is backed up daily, and of that, 9% can be regenerated from data elsewhere if it is lost;
* the system catalog and WAL are on 'safest of safe' hardware;

then your 'bulk' data on a slave can live on less-than-flawless hardware. When the (rare) power failure occurs, simply restore the last week's tables from the master or from backup; the remaining data is safe, since it is never written to. Split the 10% of non-date-partitioned data into what needs to be on safe hardware and what does not (maybe some indexes, etc.).

Most of the time, the incremental cost of getting a BBU is too small not to do it, so the above hardly applies. But if you have data that is known to be read-only, you can do many unconventional things with it safely.
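The split described above can be sketched as a backup script. This is a minimal, hedged sketch: the database name (mydb), partition names (sales_part_*), schema (hot_schema), and paths are all hypothetical, and `run` echoes each command instead of executing it so the plan can be reviewed first.

```shell
# Archive old, untouched date partitions exactly once; dump only the
# current partition plus the non-partitioned "hot" 10% every night.
run() { echo "$@"; }

plan_backups() {
    # 1. Old partitions, never written after their first week:
    #    back each one up once (incremental by partition).
    for part in sales_part_20100801 sales_part_20100802; do
        run pg_dump -Fc -t "$part" -f "/backup/archive/$part.dump" mydb
    done

    # 2. The current partition and the remaining non-partitioned data:
    #    dump these nightly.
    today=$(date -u +%Y%m%d)
    run pg_dump -Fc -t "sales_part_$today" -f "/backup/daily/sales_$today.dump" mydb
    run pg_dump -Fc -n hot_schema -f "/backup/daily/hot_$today.dump" mydb
}

plan_backups
```

Dropping the `run` wrapper turns the plan into a real backup job; `-Fc` (custom format) keeps per-table dumps compact and restorable with pg_restore.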
Anyone going with something unconventional had better know what they are doing, not just blindly plug it in and assume everything will be OK. I'd never recommend an unconventional setup to a user who isn't an expert and doesn't understand the tradeoffs.
Been there with 10TB on hardware that should have been perfectly safe. Five days of copying, wishing the whole time that pg_dump supported LZO compression, so that the dump side had a chance of keeping up with the much faster restore side while still compressing enough to save copy bandwidth.
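Lacking built-in LZO in pg_dump, one workaround is to pipe a plain-text dump through an external fast compressor such as lzop (the command-line LZO tool). A hedged sketch, with "mydb" and the paths as placeholders; `run` echoes the pipelines rather than executing them:

```shell
# Pipe pg_dump through a fast external codec so compression doesn't
# become the bottleneck the way pg_dump's internal gzip (-Z) can.
run() { echo "$@"; }

show_pipelines() {
    # Dump side: lzop -1 trades compression ratio for throughput.
    run 'pg_dump mydb | lzop -1 > /backup/mydb.sql.lzo'

    # Restore side: decompress on the fly and feed psql.
    run 'lzop -dc /backup/mydb.sql.lzo | psql mydb'
}

show_pipelines
```

The same pattern works with any streaming compressor on the PATH; the point is keeping the dump's CPU cost low enough that the copy, not the compression, is the limiting factor.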