Heikki Linnakangas <heikki 'at' enterprisedb.com> writes: > Guillaume Cottenceau wrote: > > According to documentation[1], VACUUM FULL's only benefit is > > returning unused disk space to the operating system; am I correct > > in assuming there's also the benefit of optimizing the > > performance of scans, because rows are physically compacted on > > the disk? > > That's right. Ok. Then I think the documentation should probably be updated? It seems to totally miss this benefit. We've been hit by degrading performance, probably because of too seldom VACUUM ANALYZE, and in this situation it seems that the two solutions are either VACUUM FULL or dumping and recreating the database. Maybe this situation should be described in the documentation. In this list, everyone always say "you should VACUUM ANALYZE frequently" but little is done to consider the case when we have to deal with an existing database on which this hasn't been done properly. > > With that in mind, I've tried to estimate how much benefit would > > be brought by running VACUUM FULL, with the output of VACUUM > > VERBOSE. However, it seems that for example the "removable rows" > > reported by each VACUUM VERBOSE run is actually reused by VACUUM, > > so is not what I'm looking for. > > Take a look at contrib/pgstattuple. If a table has high percentage of > free space, VACUUM FULL will compact that out. Thanks a lot. I've followed this path and I think it should be said that free_space must also be large compared to 8K - free_percent can be large for tables with very few tuples even on already compacted tables. > > Then according to documentation[2], REINDEX has some benefit when > > all but a few index keys on a page have been deleted, because the > > page remains allocated (thus, I assume it improves index scan > > performance, am I correct?). However, again I'm unable to > > estimate the expected benefit. With a slightly modified version > > of a query found in documentation[3] to see the pages used by a > > relation[4], I'm able to see that the index data from a given > > table... > > See pgstatindex, in the same contrib-module. The number you're looking > for is avg_leaf_density. REINDEX will bring that to 90% (with default > fill factor), so if it's much lower than that REINDEX will help. Woops, seems that this was not availabe in pgstattuple of pg 7.4 :/ -- Guillaume Cottenceau, MNC Mobile News Channel SA, an Alcatel-Lucent Company Av. de la Gare 10, 1003 Lausanne, Switzerland - direct +41 21 317 50 36