Re: Two efficiency questions - clustering and ints

Casey Duncan <casey@xxxxxxxxxxx> · Fri, 6 Oct 2006 11:36:51 -0700

On Oct 6, 2006, at 11:12 AM, John D. Burger wrote:

Richard Huxton wrote:

Should I always cluster the tables?  That is, even if no column  
jumps out as being involved in most queries, should I pick a  
likely one and cluster on it?

Well you cluster on an index, and if you don't think the index is  
useful, I'd drop it. If you have an index, clustering isn't  
necessarily going to help you unless you regularly read a series  
of rows in order.

Right, but clustering on that index means an index scan (with that  
index) will be faster.  This is uninteresting if the index doesn't  
get used, but is there any =downside= to doing it?

Here's a simpler question - for static data, should I always  
cluster on the index I think will be used the most?

Depends on the queries. If the index is on a foreign key value where  
there may be many rows with the same key scattered about it will help  
queries that lookup using that foreign key. Clustering on a column  
with high cardinality isn't much of a win typically for single key  
lookups (depends on the lookup pattern), but would be for ranges and  
possibly for sorting on that column.

It also depends on the size of the table and indices. If they are  
small enough to fit in memory then clustering to reduce random access  
isn't really helpful.

I would suggest doing some timing tests on typical queries with the  
data unclustered and clustered to know what you are gaining.

-Casey