On Sun, 2008-03-30 at 21:22 +0200, Joris Dobbelsteen wrote: > From the top contenders, about half are indexes, so you are stuck > with > ~200 MB of data in the tables. > Postgresql has some wasted space due to placement of the tuples in a > block and overhead for each block and row. I don't know those values, > but they are in the range of 24 bytes per tuple, I believe. Secondly > a > block is 8 KB by default and tuples cannot be stored into multiple > blocks (thats what toast to work around). > > All in all: Lookup tuple sizes, if they are small than the overhead > from > postgresql can be a big factor. If you are huge you loose on portions > of > unoccupied space in blocks. I believe pg_statistics will provide this > information. There is a pg_statistic (no "s") table, but I don't know how to get tuple size from it--the documentation refers to the source code to figure out the codes. Backing up a step, I don't know what a tuple is in Postgres, and don't see an entry for it in the index. Is a tuple just a row? That's what the docs say for the following report: # select distinct relname, reltuples, relpages from pg_class where relkind='r' and substring(relname, 1, 3) != 'pg_'; relname | reltuples | relpages -------------------------+-----------+---------- basefiles | 0 | 0 cdimages | 0 | 0 client | 2 | 1 counters | 1 | 1 device | 0 | 0 file | 650659 | 11558 filename | 623012 | 3958 fileset | 22 | 1 job | 384 | 10 jobmedia | 596 | 7 location | 0 | 0 locationlog | 0 | 0 log | 0 | 0 media | 245 | 9 mediatype | 2 | 1 path | 67908 | 966 pool | 5 | 1 sql_features | 439 | 6 sql_implementation_info | 12 | 1 sql_languages | 4 | 1 sql_packages | 10 | 1 sql_parts | 9 | 1 sql_sizing | 23 | 1 sql_sizing_profiles | 0 | 0 status | 19 | 1 storage | 2 | 1 unsavedfiles | 0 | 0 version | 1 | 1 > Another factor is representation in the SQL dump might be more > efficient > than in the database, but this highly depends on your data set. For > example, a int8 takes 8 bytes in a table, while it takes between 1 > and > ~20 in a SQL dump. > > How the plain SQL dump becomes this small I cannot explain without > much > much more details. On Tom's point, bacula regularly inserts entries into the tables and then, days to months later, deletes them. As far as I know, the VACUUM FULLs I just did were the first ever; I did do several of them because I kept getting messages about needing more fsm_pages. I am still trying to figure out if the database was getting any automatic vacuuming at all. The Postgres documentation (the database is 8.2, though I'm moving to 8.3 soon) sounds as if it's on automatically, but the Debian-specific documentation suggests I may need to do some additional things to enable it. Probably the fsm_pages being low also hurt disk useage, since the message accompanying the vacuum said that's what happens if fsm_pages is low. It was 20k; vacuum said I needed 56k, and I upped it to 80k. I'm not sure if my recent cleaning has brought the needed fsm_pages down. I've only been doing partial backups for the last few months, so there's probably less info in the tables than under normal conditions. I suppose it's possible the space I gained was just a temporary win. Ross -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general