Re: Deceiding which index to use

Mezei Zoltán <mezei.zoltan@xxxxxxxxxx> · Fri, 09 Mar 2007 16:13:47 +0100

Title: Re: [PERFORM] Deceiding which index to use

Richard Huxton wrote:

  And does the planner know that?

SELECT * FROM pg_stats WHERE tablename='subscriber' AND
attname='anumber';

It's the n_distinct you're interested in, and perhaps most_common_freqs.

n_distinct is -0.359322 and most_common_vals contains about 10
different anumbers (which are corretct), most_common_freqs are between
0.01 and 0.001. What does n_distinct exactly mean? Why is it negative?

  > Nothing, everything is the same - the problem
lies on the other table's index

> usage, using this index is fine.

The planner has to guess how many matches it will have for

subscriber=5555555. Based on that choice, it will either:

   a. Do the join, then find the highest crd values (sort)

   b. Scan the crd values backwards and then join

It's chosen (b) because it's estimating the numbers of matches

incorrectly. I'm wondering whether the system can't see through your

function-call (the cast to numeric) to determine how many matches it's

going to get for any given value.

It can see through the cast - I have just tried to create the
same database omitting the non-numeric anumbers and the results are the
same.

  If the system can't be persuaded into getting its
estimates more

accurate, it might be worth trying an index on (subscriber_id,crd) and

dropping the index on (crd) - if that's reasonable for your query
patterns.

I'll try that one if the negative n_distinct value can be a
correct one :-)

Zizi