On Thu, 2007-09-13 at 01:58 -0400, Greg Smith wrote: > On Wed, 12 Sep 2007, Scott Marlowe wrote: > > > I'm getting more and more motivated to rewrite the vacuum docs. I think > > a rewrite from the ground up might be best... I keep seeing people > > doing vacuum full on this list and I'm thinking it's as much because of > > the way the docs represent vacuum full as anything. > > I agree you shouldn't start thinking in terms of how to fix the existing > documentation. I'd suggest instead writing a tutorial leading someone > through what they need to know about their tables first and then going > into how vacuum works based on that data. I'm new to PG and it's true that I am confused. As it stands this is a newbie's understanding of the various terms. cluster -> rewrites a table according to index order so that IO is ordered/sequential reindex -> basically, rewrites the indexes adding new records/fixes up old deleted records vacuum -> does cleaning vacuum analyse -> clean and update statistics (i run this mostly) autovacuum - does vacuum analyse automatically per default setup or some or cost based parameter vacuum full -> I also do this frequently (test DB only) as a means to retrieve back used spaces due to MVCC. (trying lots of different methods of query/add new index/make concatenated join/unique keys and then deleting them if it's not useful) > > As an example, people throw around terms like "index bloat" and "dead > tuples" when talking about vacuuming. I honestly have only the vaguest idea what these 2 mean. (i only grasped recently that tuples = records/rows) > The tutorial I'd like to see > somebody write would start by explaining those terms and showing how to > measure them--preferably with a good and bad example to contrast. The way > these terms are thrown around right now, I don't expect newcomers to > understand either the documentation or the advice people are giving them; > I think it's shooting over their heads and what's needed are some > walkthroughs. Another example I'd like to see thrown in there is what it > looks like when you don't have enough FSM slots. actually, an additional item I would like is to understand explain analyse. The current docs written by tom only shows explain and not explain analyse and I'm getting confuse as to the rows=xxx vs actual rows=yyy where on some of my queries can be very far apart 1 vs 500x ratio on some problematic query[1]. And googling doesn't give much doc on the explain. (the only other useful doc I've seen is a presentation given from oscon 2003) [1](See my other post) ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match