On Wed, 20 Aug 2008, Glen Parker wrote:
> Greg Smith wrote:
>> 2) If there is a problem with the network drive, the slow/blocked network
>> write is not happening where the database is waiting for it.
>
> This is not really a consideration, is it? I was under the impression that
> no other database activity blocks waiting for the archiver.
The database will continue accumulating WAL segments it can't recycle if
the archiver keeps failing, which can cause the size of the pg_xlog
directory (often placed on a separate, smaller partition or disk) to
increase dramatically. You do not want to be the guy who caused the
database to go down because the xlog disk filled up after some network
mount flaked out. I've seen that happen way too many times in WAN
environments where the remote location was unreachable for days (after a
natural disaster, for example), and since pg_xlog never got very big under
normal operation, it wasn't sized for that.
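
If you want a rough way to keep an eye on this, counting the segments the
archiver hasn't dealt with yet works well enough. A minimal sketch along
those lines, assuming an 8.x-style layout where pending segments leave
.ready files under pg_xlog/archive_status; the PGDATA path and the
threshold are placeholders you'd adjust for your installation:

    #!/bin/sh
    # check_archive_backlog.sh -- rough sketch, adjust paths and threshold
    PGDATA=/var/lib/pgsql/data
    THRESHOLD=50

    # Each segment the archiver hasn't handled yet leaves a .ready file behind
    PENDING=`ls "$PGDATA/pg_xlog/archive_status" 2>/dev/null | grep -c '\.ready$'`

    # How much space pg_xlog is actually using right now
    XLOG_SIZE=`du -sh "$PGDATA/pg_xlog" | cut -f1`

    echo "pending segments: $PENDING, pg_xlog size: $XLOG_SIZE"

    if [ "$PENDING" -gt "$THRESHOLD" ]; then
        echo "WARNING: archive_command appears to be failing, pg_xlog is growing" >&2
        exit 1
    fi
    exit 0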
It will also slow things down a bit under heavy write loads, as every
segment change will result in creating a new segment file rather than
re-using an old one.
You also don't want to be the guy who has to explain why the database is
taking hours to come back up again after it crashed and has 4000 WAL
segments to replay, because archiving failed for a long time and prevented
proper checkpoints (ask Robert Treat if you don't believe me, he also once
was that guy). While occasional archiving failures are tolerated just
fine and you can catch up later, a design that presumes it's OK for them to
happen for extended periods is not optimal. A crash after an excess of
segments have accumulated can result in a large amount of replay downtime
for the server, and in some environments (where the primary is very
redundant for example) the risk of that is worse than saying your PITR
backup is hosed and just throwing away the segments if you can't transfer
them to the backup for some reason.
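
If you do decide that in a pinch the PITR backup is expendable, one way to
make that decision operational without touching the server's configuration
is a wrapper around the real copy that checks for an operator-created flag
file. A rough sketch only; the flag path, destination, and script name are
all stand-ins for whatever you actually use, and touching the flag file
deliberately discards segments and invalidates the base backup:

    #!/bin/sh
    # archive_wrapper.sh -- hypothetical archive_command wrapper with a
    # "give up" switch.  Called as:
    #   archive_command = '/usr/local/bin/archive_wrapper.sh %p %f'
    WALPATH="$1"
    WALFILE="$2"
    FLAG=/etc/postgresql/discard_wal_archives

    if [ -e "$FLAG" ]; then
        # Operator has decided the backup is already hosed; report success
        # without copying so the server can recycle the segment
        exit 0
    fi

    # Normal case: copy to the archive area (substitute your real destination)
    cp "$WALPATH" /mnt/wal_archive/"$WALFILE"
    exit $?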
>> 3) In an emergency where the usual network drive is unavailable, you can
>> temporarily change your archive strategy without touching the database
>> server's configuration.
>
> That's just as true with the direct write method. You simply change the
> archive script to do nothing and return a failed exit status. The DBMS
> will keep trying until you change it back, and then everything catches up
> and proceeds as normal.
I've already been through above why assuming archive failures have zero
cost can be bad. There are some additional issues here as well.
When's the last time you tried to get a DBA to send a "kill -HUP" to a
production database to get it to recognize that archive_command was
changed in the postgresql.conf? Sure, we all know it's not supposed to
ever hurt the server, but Murphy is a tricky character and lots of people
are (rightly) paranoid. I think we've all known admins whose fingers were
trained to only ever type "kill -9" no matter what signal they meant to
send.
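
To be clear about what that change actually requires on the server side:
after editing archive_command in postgresql.conf, somebody still has to
trigger a reload one way or another. The usual options look something like
this (the data directory path is just an example):

    # as the postgres user, reload without restarting:
    pg_ctl reload -D /var/lib/pgsql/data

    # or from psql, with no shell signal involved at all:
    SELECT pg_reload_conf();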
OK, maybe you're smarter than that and used a separate script. DBAs are
also not happy changing a script that gets called by the database every
couple of minutes, and as soon as there's more than one piece involved it
can be difficult to do an atomic update of said script.
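
If you do go the separate script route, at least make updates to it atomic
by staging the new version and renaming it into place rather than editing
the live file. A sketch, with the script name and location being whatever
you really use, and assuming both copies sit on the same filesystem so the
mv is a single rename:

    # stage the new version next to the live one
    cp archive_to_nfs.sh.new /usr/local/bin/.archive_to_nfs.sh.tmp
    chmod 755 /usr/local/bin/.archive_to_nfs.sh.tmp
    # rename() is atomic, so the archiver never sees a half-written script
    mv /usr/local/bin/.archive_to_nfs.sh.tmp /usr/local/bin/archive_to_nfs.sh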
Any time you can decouple failure recovery cleanup (like a network drive
getting disconnected and screwing up archiving) completely from routine
database operation, you have gained flexibility in how easily and safely
you can recover from that failure. I assure you that all the suggestions I
made come from painful and completely unexpected messes.
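
To make that decoupling concrete: the sort of setup I keep suggesting is
one where archive_command only ever copies to a local spool directory
(something like archive_command = 'cp %p /var/spool/walarchive/%f'), with a
completely separate job pushing the spool out to the network drive on its
own schedule. A rough sketch of that second piece, with all the paths being
stand-ins:

    #!/bin/sh
    # push_wal.sh -- run from cron outside the database; if the network
    # drive is down this just fails and tries again on the next run,
    # without the archiver ever noticing
    SPOOL=/var/spool/walarchive
    DEST=/mnt/remote_wal

    for f in "$SPOOL"/*; do
        [ -f "$f" ] || continue
        # only drop the local copy once the remote copy succeeded
        if cp "$f" "$DEST"/; then
            rm "$f"
        else
            exit 1
        fi
    done
    exit 0

A real version would also want archive_command to copy to a temporary name
and rename into the spool, so this job never ships a half-written segment.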
--
* Greg Smith gsmith@xxxxxxxxxxxxx http://www.gregsmith.com Baltimore, MD