On Sun, 2008-03-30 at 20:27 +0200, Joris Dobbelsteen wrote:
> Ross Boylan wrote:
> > I have a postgres server for which du reports
> >       1188072 /var/lib/postgresql/8.2/main
> > on a Linux system.
> > The server has only one real database, which is for bacula.  When I
> > dump the database, it's 73MB.
> >
> > This is immediately after I did a full vacuum and restarted the
> > server.
> >
> > Also,
> > bacula=> SELECT relname, relpages FROM pg_class ORDER BY relpages DESC
> > limit 15;
> >              relname             | relpages
> > ---------------------------------+----------
> >  file_jpfid_idx                  |    27122
> >  file_pathid_idx                 |    17969
> >  file_jobid_idx                  |    17948
> >  file_pkey                       |    14580
> >  file_fp_idx                     |    12714
> >  file                            |    11558
> >  file_filenameid_idx             |     9806
> >  filename                        |     3958
> >  filename_name_idx               |     2510
> >  filename_pkey                   |     1367
> >  path                            |      966
> >  path_name_idx                   |      950
> >  path_pkey                       |      151
> >  pg_attribute_relid_attnam_index |       46
> >  pg_proc                         |       45
> >
> > It seems very strange to me that there is such a difference in size
> > between the dump and the database: the data store is almost 15 times
> > larger than the dump.
> >
> > Is this to be expected (e.g., from the indices taking up disk space)?
> > Is there anything I can do to reclaim some disk space?
>
> There are a few factors you need to take into account:
>
>     * Data storage in the database is packed into blocks and contains
>       header data. Since data needs to be put into blocks there is a
>       potential for wasting space; in the worst case a block can end
>       up holding only a single row.
>     * You need to vacuum often, to ensure obsolete rows are removed
>       and the space can be reused.
>     * Tables are never shrunk, only grown. I think cluster and vacuum
>       full will reduce the size of your table.
>     * Indexes are not in the backup; they are derived from the table
>       data on restore.
>
> If you remove the indexes you are left with 150~200 MB of data (I
> guessed).
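For concreteness, the vacuum full / cluster / reindex options mentioned above look roughly like this in psql. This is only a sketch using the bacula table and index names from the listing above; note that 8.2 uses the older `CLUSTER index ON table` syntax:

```sql
-- Compact the table itself and return the freed space to the OS.
-- Takes an exclusive lock on the table, so run it in a maintenance window.
VACUUM FULL VERBOSE file;

-- Alternative: rewrite the table in the order of one of its indexes,
-- which also compacts it (pre-8.3 syntax shown).
CLUSTER file_pkey ON file;

-- Rebuild every index on the table, discarding accumulated index bloat.
REINDEX TABLE file;
```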
> Doing reindex will rebuild the index and get rid of all the bloat it
> has collected during use. _I recommend you try this_, as your indexes
> on the file table look quite huge.

Reindexing had a huge effect.  After reindexing the top tables (file,
filename and path) I now see

SELECT relname, relpages FROM pg_class ORDER BY relpages DESC limit 15;
             relname             | relpages
---------------------------------+----------
 file                            |    11558
 filename                        |     3958
 filename_name_idx               |     2383
 file_jpfid_idx                  |     2145
 file_fp_idx                     |     1787
 file_jobid_idx                  |     1427
 file_pathid_idx                 |     1427
 file_pkey                       |     1427
 file_filenameid_idx             |     1427
 filename_pkey                   |     1367
 path                            |      966
 path_name_idx                   |      871
 path_pkey                       |      151
 pg_attribute_relid_attnam_index |       46
 pg_proc                         |       45

and du now reports 451M.  That still seems a bit large, given the size
of the sql dump, but it's almost 2/3 lower than it was before.  Thanks
so much!

I guess I need to figure out how to reindex automatically.

> But the most important factor for you will be the following:
>
>     * Backups are compressed. Since you store filenames and paths,
>       these have a very high amount of regularity and therefore are
>       very good targets for compression. This can save a huge amount
>       of data. If you take a compression factor of 50%~70% you will
>       reach your 70 MB.

I don't see how this is relevant, since my dump file was plain text
(sql).

> Oh, server restarts will not help reduce your database size. In fact,
> nothing at all should change, except lower performance until
> sufficient cached data is back in the cache again.
>
> Hope this helps...

It was a huge help.

> - Joris

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
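A footnote on knowing when a reindex is due again: relpages is only refreshed by VACUUM/ANALYZE, so a live-size query such as the one below can be run periodically and compared against the freshly-rebuilt sizes. This is a sketch; pg_relation_size and pg_size_pretty are available from 8.1 on:

```sql
-- Show the largest tables and indexes with current, human-readable sizes.
SELECT relname,
       relkind,
       pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_class
WHERE relkind IN ('r', 'i')   -- 'r' = table, 'i' = index
ORDER BY pg_relation_size(oid) DESC
LIMIT 15;
```

When an index grows to several times its just-rebuilt size, that is a reasonable cue to schedule another REINDEX.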