On Wed, Jun 1, 2011 at 7:54 PM, Robert James <srobertjames@xxxxxxxxx> wrote: > Hi. I'm interested in understanding the differences between > CLUSTERing a table and making a dedicated one. > > We have a table with about 1 million records. On a given day, only > about 1% of them are of interest. That 1% changes every day (it's > WHERE active_date = today), and so we index and cluster on it. > > Even so, the planner shows a very large cost for the Index Scan: about > 3500. If I instead do a SELECT INTO temp_table FROM big_table WHERE > active_date = today, and then do SELECT * FROM temp_table, I get a > planned cost of 65. Yet, the actual time for both queries is almost > identical. > > Questions: > 1. Why is there such a discrepancy between the planner's estimate and > the actual cost? > > 2. In a case like this, will I in general see a performance gain by > doing a daily SELECT INTO and then querying from that table? My ad hoc > test doesn't indicate I would (despite the planner's prediction), and > I'd rather avoid this if it won't help. > > 3. In general, does CLUSTER provide all the performance benefits of a > dedicated table? If it doesn't, what does it lack? no. i suspect you may be over thinking the problem -- what led you to want to cluster in the first place? merlin -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance