
Re: database 1.2G, pg_dump 73M?!

Ross Boylan wrote:
I have a postgres server for which du reports
1188072	/var/lib/postgresql/8.2/main
on a Linux system.
The server has only one real database, which is for bacula.  When I dump
the database, it's 73 MB.

This is immediately after I did a full vacuum and restarted the server.

Also,
bacula=> SELECT relname, relpages FROM pg_class ORDER BY relpages DESC
limit 15;
             relname             | relpages
---------------------------------+----------
 file_jpfid_idx                  |    27122
 file_pathid_idx                 |    17969
 file_jobid_idx                  |    17948
 file_pkey                       |    14580
 file_fp_idx                     |    12714
 file                            |    11558
 file_filenameid_idx             |     9806
 filename                        |     3958
 filename_name_idx               |     2510
 filename_pkey                   |     1367
 path                            |      966
 path_name_idx                   |      950
 path_pkey                       |      151
 pg_attribute_relid_attnam_index |       46
 pg_proc                         |       45

It seems very strange to me that there is such a difference in size
between the dump and the database: the data store is almost 15 times
larger than the dump.

Is this to be expected (e.g., from the indices taking up disk space)?
Is there anything I can do to reclaim some disk space?

There are a few factors you need to take into account:

   * Data storage in the database is packed into blocks and includes
     per-row header data. Since rows must fit into fixed-size blocks
     there is a potential for wasting space; in the worst case a block
     may end up holding only a single row.
   * You need to vacuum often, to ensure obsolete rows are removed and
     space can be reused.
   * Tables only grow; they are never shrunk automatically. I think
     CLUSTER and VACUUM FULL will reduce the size of your tables.
   * Indexes are not in the backup; they are rebuilt from the table
     data on a restore.
     If you subtract the index pages you are left with 150~200 MB of
     data (a guess).
     Doing REINDEX will rebuild the indexes and get rid of all the
     bloat they have collected during use.  _I recommend you try
     this_, as your indexes on the file table look quite huge.
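
For example, something along these lines (a sketch only; the table names are taken from your query output above, and note that REINDEX and VACUUM FULL block access to the table while they run, so do this outside backup windows):

```sql
-- Rebuild the bloated indexes on the big tables.
-- This blocks writes to each table while it runs.
REINDEX TABLE file;
REINDEX TABLE filename;
REINDEX TABLE path;

-- Compact the table itself and return the freed space
-- to the operating system.
VACUUM FULL ANALYZE file;

-- Check the effect afterwards; pg_relation_size and
-- pg_size_pretty are available in 8.1 and later.
SELECT relname,
       pg_size_pretty(pg_relation_size(oid)) AS size
  FROM pg_class
 ORDER BY pg_relation_size(oid) DESC
 LIMIT 15;
```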

But the most important factor for you will be the following:

   * Backups are compressed. Since you store filenames and paths, the
     data is highly repetitive and therefore a very good target for
     compression. This can save a huge amount of space. If you take a
     compression factor of 50%~70% you will reach your 70 MB.
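
You can see the compression effect directly by comparing dump sizes (a sketch, assuming a database named bacula and a server you can reach; pg_dump's -Fc custom format compresses on its own, and piping the plain format through gzip gives a similar result):

```shell
# Plain-text dump, uncompressed:
pg_dump bacula > bacula.sql

# Custom format, compressed by pg_dump itself:
pg_dump -Fc bacula > bacula.dump

# Or compress the plain dump externally:
pg_dump bacula | gzip > bacula.sql.gz

# Compare the three sizes:
ls -lh bacula.sql bacula.dump bacula.sql.gz
```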

Oh, and server restarts will not help reduce your database size. In fact, nothing at all should change, except lower performance until sufficient data is back in the cache again.

Hope this helps...

- Joris

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
