Re: vacuumdb question/problem

Dave,

You're on the right track now; however, unloading the table to a flat file with pg_dump may create a very large file.  Make sure you use compression to keep the dump small: either the custom format ("-Fc", which compresses by default) or pipe a plain dump through gzip, which uses two CPUs, one for pg_dump and one for gzip.  If the dump is successful you don't have to drop the table; you can just truncate it, which recovers the used space and hands it back to the file system.  TRUNCATE is very fast, but then again so is DROP TABLE, so be careful.  You can then use pg_restore to put the data back into the original table, and the reloaded table will take up only the space it actually needs.
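
As a minimal sketch of that sequence (the table, database, and file names below are hypothetical placeholders; verify the dump before truncating anything):

    # custom-format dump; -Fc output is compressed by default
    pg_dump -Fc -t big_table mydb > big_table.dump

    # or: plain dump piped through gzip, so pg_dump and gzip
    # each get their own CPU
    pg_dump -t big_table mydb | gzip > big_table.sql.gz

    # only after the dump is verified: truncate to hand the
    # space back to the file system
    psql -d mydb -c "TRUNCATE TABLE big_table;"

    # reload the data into the (now empty) original table
    pg_restore --data-only -t big_table -d mydb big_table.dump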

Take very, very seriously what Kevin said earlier about autovacuum and about scheduled VACUUM ANALYZE VERBOSE jobs to keep dead space reusable.  Upgrading to 8.4 removes any need to manually manage the free space map as you must in 8.2, and is worth it just to get a much nicer version of autovacuum.  Note, however, that automatic casting between data types changed between 8.2 and 8.4 (and 9.x), so test before making a wholesale commitment to upgrading.  The fixes to your code aren't hard, but they need to be done for you to get consistent results pre- and post-upgrade.
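
Just as an example of what such a scheduled job might look like, a crontab entry for the postgres user could run vacuumdb nightly (table, database, and log path are placeholders to adapt to your setup):

    # VACUUM ANALYZE VERBOSE the big table at 02:00 every night
    0 2 * * * vacuumdb --analyze --verbose --table=big_table mydb >> /var/log/pg_vacuum.log 2>&1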

Finally, if there are natural partitions to the data in that table, consider using PostgreSQL's partitioning feature.  What those partitions are depends entirely on your use case(s).  If the technique fits, you would be able to drop and create partitions to clear out and populate data very quickly.  Check out http://www.postgresql.org/docs/8.4/interactive/ddl-partitioning.html for details of partitioning in PG 8.4.
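
In 8.4, partitioning is done with table inheritance plus CHECK constraints.  Just as a sketch, assuming a table with a timestamp column suitable for monthly ranges (all names below are made up for illustration):

    psql -d mydb <<'SQL'
    -- parent table holds no rows itself
    CREATE TABLE readings (
        site_id  integer,
        obs_time timestamp NOT NULL,
        value    numeric
    );

    -- one child per month, each with a CHECK constraint so the
    -- planner can skip it when constraint_exclusion is on
    CREATE TABLE readings_2011_07 (
        CHECK (obs_time >= DATE '2011-07-01' AND obs_time < DATE '2011-08-01')
    ) INHERITS (readings);

    CREATE TABLE readings_2011_08 (
        CHECK (obs_time >= DATE '2011-08-01' AND obs_time < DATE '2011-09-01')
    ) INHERITS (readings);

    -- clearing out an old month is then a near-instant DROP TABLE
    -- instead of a massive DELETE followed by a vacuum
    DROP TABLE readings_2011_07;
    SQL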

Good luck!

Bob Lunney


----- Original Message -----
From: David Ondrejik <David.Ondrejik@xxxxxxxx>
To: pgsql-admin <pgsql-admin@xxxxxxxxxxxxxx>
Cc: 
Sent: Thursday, July 21, 2011 2:12 PM
Subject: Re:  vacuumdb question/problem

I think I see a (my) fatal flaw that will cause the cluster to fail.


>>  From the info I received from previous posts, I am going to change
>> my game plan. If anyone has thoughts as to different process or
>> can confirm that I am on the right track, I would appreciate your
>> input.
>> 
>> 1. I am going to run a CLUSTER on the table instead of a VACUUM
>> FULL.
Kevin Grittner stated:
> If you have room for a second copy of your data, that is almost
> always much faster, and less prone to problems.

I looked at the sizes of the tables in the database, and the table I am trying to run the CLUSTER on is 275G while I only have 57G free.  I don't know how much of that 275G is live data and how much is dead space, but I am guessing the CLUSTER would fail for lack of room to write a second copy of the data.

Are there any other options?

If I unload the table to a flat file, then drop the table from the database, recreate it, and finally reload the data - will that reclaim the space?

Kevin - thanks for the book recommendation.  Will order it tomorrow.

Thanks again for all the technical help!

Dave


-- 
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


