Re: estimating the need for VACUUM FULL and REINDEX

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Heikki Linnakangas <heikki 'at' enterprisedb.com> writes:

> Guillaume Cottenceau wrote:
> > According to documentation[1], VACUUM FULL's only benefit is
> > returning unused disk space to the operating system; am I correct
> > in assuming there's also the benefit of optimizing the
> > performance of scans, because rows are physically compacted on
> > the disk?
> 
> That's right.

Ok. Then I think the documentation should probably be updated? It
seems to totally miss this benefit.

We've been hit by degrading performance, probably because of too
seldom VACUUM ANALYZE, and in this situation it seems that the
two solutions are either VACUUM FULL or dumping and recreating
the database. Maybe this situation should be described in the
documentation. In this list, everyone always say "you should
VACUUM ANALYZE frequently" but little is done to consider the
case when we have to deal with an existing database on which this
hasn't been done properly.
 
> > With that in mind, I've tried to estimate how much benefit would
> > be brought by running VACUUM FULL, with the output of VACUUM
> > VERBOSE. However, it seems that for example the "removable rows"
> > reported by each VACUUM VERBOSE run is actually reused by VACUUM,
> > so is not what I'm looking for.
> 
> Take a look at contrib/pgstattuple. If a table has high percentage of
> free space, VACUUM FULL will compact that out.

Thanks a lot. I've followed this path and I think it should be
said that free_space must also be large compared to 8K -
free_percent can be large for tables with very few tuples even on
already compacted tables.
 
> > Then according to documentation[2], REINDEX has some benefit when
> > all but a few index keys on a page have been deleted, because the
> > page remains allocated (thus, I assume it improves index scan
> > performance, am I correct?). However, again I'm unable to
> > estimate the expected benefit. With a slightly modified version
> > of a query found in documentation[3] to see the pages used by a
> > relation[4], I'm able to see that the index data from a given
> > table...
> 
> See pgstatindex, in the same contrib-module. The number you're looking
> for is avg_leaf_density. REINDEX will bring that to 90% (with default
> fill factor), so if it's much lower than that REINDEX will help.

Woops, seems that this was not availabe in pgstattuple of pg 7.4 :/

-- 
Guillaume Cottenceau, MNC Mobile News Channel SA, an Alcatel-Lucent Company
Av. de la Gare 10, 1003 Lausanne, Switzerland - direct +41 21 317 50 36


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux