Natalie Wenz <nataliewenz@xxxxxxxxxxx> wrote:

> Sorry; my description of what is going on was a little unclear.
> We didn't upgrade the existing database. We moved it to different
> hardware, and just created a brand new database to accept the
> data that had been backing up in sqlite files while our original
> database was offline. I'm still dealing with the wraparound on
> the original, just on a different machine.

OK, to restate to be sure I understand: the original database is being vacuumed in a new location, and a new 9.3 database in the original location has absorbed the queued data? Once you complete the vacuum, will you copy the old data back into the new database at the old location?

> autovacuum_freeze_max_age | 800000000

> Normally the autovacuum seems to keep up sufficiently. We got into
> trouble with a bad combination of not-typical long-running
> queries, a disk failure and subsequent zpool repair, and the only
> person who checks the log files regularly (me) was out of the
> office. This has been so painful and slow to recover from, I
> don't think we'll ever get into this mess again. (At least not
> quite like this. I seem to have a knack for finding *new* ways to
> break things.)

A perfect storm of events, eh? It's hard to have things always go smoothly in the face of such events, but I see a couple of things you might want to consider. Increasing autovacuum_freeze_max_age reduces the amount of time you have to get back on track, so you might want to take that back down to the default. There was a bug causing wraparound-prevention autovacuums to trigger too frequently, which is now fixed in the latest minor releases, so making that change might not be as painful as you expect. Make sure you are monitoring for long-running transactions, so you don't get burned by one that is accidental.
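As an untested sketch of what that monitoring might look like (assuming a 9.2-or-later pg_stat_activity layout; the one-hour threshold is an arbitrary example, not a recommendation):

```sql
-- Hypothetical monitoring sketch: flag transactions open longer than
-- one hour.  Adjust the interval to whatever is "too long" for your site.
SELECT pid, usename, xact_start, now() - xact_start AS xact_age, query
  FROM pg_stat_activity
 WHERE xact_start IS NOT NULL
   AND now() - xact_start > interval '1 hour'
 ORDER BY xact_start;

-- Also useful: watch how close each table is to forced wraparound vacuum.
SELECT relname, age(relfrozenxid) AS xid_age
  FROM pg_class
 WHERE relkind = 'r'
 ORDER BY age(relfrozenxid) DESC
 LIMIT 10;
```

Run from cron or your monitoring system, either query gives you warning well before a wraparound-prevention vacuum becomes urgent.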
Also, if you have a processing cycle where there are off-peak hours on a daily or weekly basis, you might want to run a VACUUM ANALYZE command during those windows, to get some of the freezing done before it becomes critical.

> I will also take this opportunity to mention again that if anyone
> is considering making a patch for 64-bit xids, you would make
> at least one small group of people very, very happy. :)

While 64-bit xids aren't likely, there is some work taking a more creative approach to the issue which might make you even happier in a year or so. :-)

> Side question: some of the longer queries that were running when
> the database yakked were deletes of old data that had been
> manually migrated to a new table with a more appropriate format.
> We were running out of disk space, so we were trying to clear up
> some space by removing data we now had in two places. Does a
> delete of previously-frozen rows unfreeze them, or anything like
> that? Because in a series of maybe a dozen queries or so, we
> deleted billions of rows. Does that generate a significant amount
> of extra work for the autovacuumer?

The pages which had tuples deleted would need to be cleaned up by vacuum and rewritten. It would also remove all index entries for the deleted rows, and it might scan backward from the end of the table to release space to the OS. That could conceivably be enough avoidable work to make your idea of copying out the remaining data feasible. What percentage of the rows were deleted? Could your copy-out be to a separate set of drives? If you know, at the time a row is added, which group it will be in for deletion, it might pay to move to partitioning, so that a group of rows could be deleted pretty much as fast as you can drop a table.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin