Search Postgresql Archives

Re: why is pg_dump so much smaller than my database?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Interesting.  Is there a perf hit to having a big file on disk?  My understanding is that the primary thing that really matters is keeping your active set in memory.

This is on Postgres 9.0.x, running on Heroku/ec2.

We do have extremely compressible data so it may be that the dump is compressed: I'm downloading it now to check.

Thanks for the replies,
Carson

On Thu, Mar 29, 2012 at 12:11 AM, John R Pierce <pierce@xxxxxxxxxxxx> wrote:
On 03/28/12 10:32 PM, Carson Gross wrote:
I've got a pretty big database (~30 gigs) and when I do a pg_dump, it ends up only being 2 gigs.

The database consists mainly of one very large table (w/ a few varchar columns) which, according to pg_relation_size() is 10 gigs and pg_total_relation_size() is 26 gigs (we need to drop some indexes there.)

I'm just trying to get my head around the pg_dump being an order of magnitude smaller than the darned database itself.  I would thing that the db would offer more efficient encoding for a lot of stuff vs. an ascii file.


its quite possible your table has a lot of free tuples scattered through it as a result of updates or deletes.   vacuum makes these available for reuse but does NOT free the disk space.  ditto, your indexes might be very bloated, a reindex may significantly shrink them

if you can afford some application downtime, you may consider running CLUSTER on that table, it will copy all the active tuples of the table to new file space, and free the old, and also does the reindex operation.   I would vacuum the table first, after ensuring there aren't any old active transactions ('IDLE IN TRANSACTION' status in pg_stat_activity).   Note that cluster takes an exclusive lock on the table, this is why I said you need some application downtime.

you don't say what version you're running, older versions had more problems with bloating indexes than newer ones.







--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux