Re: [Again] Postgres performance problem

El-Lotso <el.lotso@xxxxxxxxx> · Thu, 13 Sep 2007 15:20:13 +0800

On Thu, 2007-09-13 at 01:58 -0400, Greg Smith wrote:
> On Wed, 12 Sep 2007, Scott Marlowe wrote:
> 
> > I'm getting more and more motivated to rewrite the vacuum docs.  I think 
> > a rewrite from the ground up might be best...  I keep seeing people 
> > doing vacuum full on this list and I'm thinking it's as much because of 
> > the way the docs represent vacuum full as anything.
> 
> I agree you shouldn't start thinking in terms of how to fix the existing 
> documentation.  I'd suggest instead writing a tutorial leading someone 
> through what they need to know about their tables first and then going 
> into how vacuum works based on that data.

I'm new to PG and it's true that I am confused.
As it stands, this is a newbie's understanding of the various terms:

cluster -> rewrites a table in index order so that IO is
ordered/sequential
reindex -> basically, rewrites the indexes, adding new records and fixing
up old deleted records
vacuum -> does cleaning
vacuum analyse -> cleans and updates statistics (I run this mostly)
autovacuum -> does vacuum analyse automatically, per the default setup or
some cost-based parameters

vacuum full -> I also do this frequently (test DB only) as a means to
reclaim space used up due to MVCC. (I'm trying lots of different query
methods, adding new indexes, and making concatenated join/unique keys,
then deleting them if they're not useful.)
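
To check my own understanding of vacuum vs vacuum full, here is a toy
Python model. It is only a sketch of the idea and not how PG stores data
internally; the ToyTable class and all of its method names are made up for
illustration. It assumes that deletes and updates leave dead row versions
behind, that plain vacuum only marks that space as reusable, and that
vacuum full compacts the table so it actually shrinks.

```python
# Toy model only: a "table" is a list of slots. Each slot holds a live row,
# a dead row version, or free (reusable) space. Not real PostgreSQL internals.
LIVE, DEAD, FREE = "live", "dead", "free"


class ToyTable:
    def __init__(self):
        self.slots = []  # each slot is a two-item list: [state, value]

    def insert(self, value):
        # Reuse a free slot if one exists, otherwise grow the table.
        for slot in self.slots:
            if slot[0] == FREE:
                slot[0], slot[1] = LIVE, value
                return
        self.slots.append([LIVE, value])

    def delete(self, value):
        # MVCC-style delete: the row version is only marked dead, not removed.
        for slot in self.slots:
            if slot[0] == LIVE and slot[1] == value:
                slot[0] = DEAD
                return

    def update(self, old, new):
        # An update leaves the old version dead and writes a new version.
        self.delete(old)
        self.insert(new)

    def vacuum(self):
        # Plain vacuum: dead slots become reusable; the table size is unchanged.
        for slot in self.slots:
            if slot[0] == DEAD:
                slot[0], slot[1] = FREE, None

    def vacuum_full(self):
        # Vacuum full: keep only live rows, so the table shrinks.
        self.slots = [s for s in self.slots if s[0] == LIVE]

    def size(self):
        return len(self.slots)

    def count(self, state):
        return sum(1 for s in self.slots if s[0] == state)
```

In this model, four inserts followed by one update and one delete give a
table of 5 slots with 2 of them dead. vacuum() leaves the size at 5 but
makes those 2 slots reusable, while vacuum_full() shrinks it to 3.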

> 
> As an example, people throw around terms like "index bloat" and "dead 
> tuples" when talking about vacuuming.  

I honestly have only the vaguest idea what these two mean. (I only grasped
recently that tuples = records/rows.)

> The tutorial I'd like to see 
> somebody write would start by explaining those terms and showing how to 
> measure them--preferably with a good and bad example to contrast.  The way 
> these terms are thrown around right now, I don't expect newcomers to 
> understand either the documentation or the advice people are giving them; 
> I think it's shooting over their heads and what's needed are some 
> walkthroughs.  Another example I'd like to see thrown in there is what it 
> looks like when you don't have enough FSM slots.

Actually, an additional item I would like is to understand explain
analyse. The current docs written by Tom only show explain and not
explain analyse, and I'm getting confused by rows=xxx vs actual
rows=yyy, which can be very far apart (a ratio of 1 to 500) on some of
my problematic queries[1]. Googling doesn't turn up much documentation
on explain either. (The only other useful doc I've seen is a
presentation given at OSCON 2003.)

[1](See my other post)
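
To make the rows=xxx vs actual rows=yyy comparison concrete, here is a
small Python sketch that pulls both numbers out of explain analyse text
and prints how far apart they are. The row_estimates function, the regex,
and the sample plan are all my own invention for illustration (the sample
is not output from a real query), and the regex assumes the usual
"(cost=... rows=N width=N) (actual time=... rows=N loops=N)" layout.

```python
import re

# Matches the estimated "rows=" in the cost part and the measured "rows="
# in the "actual" part of one EXPLAIN ANALYZE plan line.
PLAN_LINE = re.compile(
    r"rows=(?P<est>\d+) width=\d+\)"
    r".*?actual time=[\d.]+\.\.[\d.]+ rows=(?P<act>[\d.]+) loops=(?P<loops>\d+)"
)


def row_estimates(plan_text):
    """Return one (node, estimated_rows, actual_rows, ratio) tuple per plan node."""
    results = []
    for line in plan_text.splitlines():
        m = PLAN_LINE.search(line)
        if m is None:
            continue  # e.g. the "Total runtime" line or a node never executed
        est = int(m.group("est"))
        act = float(m.group("act"))  # actual rows are reported per loop
        # Node description is everything before "(cost=", minus the "->" arrow.
        node = re.sub(r"^\s*(->)?\s*", "", line.split("(cost=")[0]).rstrip()
        # A ratio well above 1 means the planner under-estimated the row count.
        ratio = act / est if est else float("inf")
        results.append((node, est, act, ratio))
    return results


# Invented sample plan, only for illustration (not from a real query).
SAMPLE = """\
Nested Loop  (cost=0.00..20.50 rows=1 width=8) (actual time=0.050..4.200 rows=500 loops=1)
  ->  Seq Scan on a  (cost=0.00..10.00 rows=100 width=4) (actual time=0.010..0.100 rows=100 loops=1)
  ->  Index Scan using b_idx on b  (cost=0.00..0.10 rows=1 width=4) (actual time=0.005..0.030 rows=5 loops=100)
Total runtime: 4.500 ms
"""

for node, est, act, ratio in row_estimates(SAMPLE):
    print(f"{node}: estimated={est} actual={act:g} ratio={ratio:g}")
```

On the invented sample, the Nested Loop line is the one with the 1 vs 500
gap, which is the kind of node I would look at first.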
