Two efficiency questions - clustering and ints

"John D. Burger" <john@xxxxxxxxx> · Thu, 5 Oct 2006 13:33:03 -0400

I have a good-size DB (some tables approaching 100M rows), with  
essentially static data.

Should I always cluster the tables?  That is, even if no column jumps  
out as being involved in most queries, should I pick a likely one and  
cluster on it?  (Of course, this assumes that doing so won't cause  
bad correlation with any other oft-used column.)
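
To make the clustering question concrete, here is a minimal sketch of the operation being considered and one way to inspect its effect on column ordering. The table, column, and index names (ratings, item_id, ratings_item_idx) are hypothetical and do not come from the actual schema.

```sql
-- Hypothetical names, for illustration only.
CREATE INDEX ratings_item_idx ON ratings (item_id);

-- Physically rewrite the table in index order (one-time operation that
-- takes an exclusive lock). This is the older spelling; newer releases
-- also accept: CLUSTER ratings USING ratings_item_idx;
CLUSTER ratings_item_idx ON ratings;

-- Refresh planner statistics, then see how closely each column's
-- physical order follows its sort order (near +1 or -1 = well ordered,
-- near 0 = scattered).
ANALYZE ratings;
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'ratings';
```

Comparing the correlation values before and after clustering would show whether ordering by one column has disturbed the ordering of another oft-used column.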

Another question, about integer types: if no cross-type coercion is
involved, is there any reason not to choose the smallest int type
that will fit my data?  In particular, I have a column of
small-integer ratings with values in, say, [1, 10].  If I'm only
comparing within such ratings, and possibly computing floating point
averages, etc., what are the good and bad points of using, say,
SMALLINT?  What about NUMERIC(1) or NUMERIC(2)?
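
One way to compare the candidate types empirically is sketched below. It assumes a throwaway table (ratings_demo, a hypothetical name) and uses pg_column_size() to report the bytes each stored value occupies. It measures individual values only, so it says nothing about row alignment padding, which depends on the neighbouring columns.

```sql
-- Hypothetical scratch table holding the same rating in each candidate type.
CREATE TABLE ratings_demo (
    rating_small SMALLINT,
    rating_int   INTEGER,
    rating_num   NUMERIC(2)
);
INSERT INTO ratings_demo VALUES (7, 7, 7);

-- Bytes used to store each value.
SELECT pg_column_size(rating_small) AS small_bytes,
       pg_column_size(rating_int)   AS int_bytes,
       pg_column_size(rating_num)   AS numeric_bytes
FROM ratings_demo;

-- avg() over an integer column yields NUMERIC, not an integer;
-- cast explicitly if a floating point result is wanted.
SELECT avg(rating_small)         AS avg_numeric,
       avg(rating_small)::float8 AS avg_float
FROM ratings_demo;
```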

Thanks in advance for the usual brilliant replies!

- John D. Burger
  MITRE