Re: query planner not using the correct index

Tom Lane <tgl@xxxxxxxxxxxxx> · Thu, 07 Aug 2008 19:48:11 -0400

"Joshua Shanks" <jjshanks@xxxxxxxxx> writes:
> SELECT null_frac, n_distinct, most_common_vals, most_common_freqs FROM
> pg_stats WHERE tablename = 'bars' AND attname='bars_id';
>  null_frac | n_distinct |   most_common_vals   |     most_common_freqs
> -----------+------------+----------------------+---------------------------
>          0 |         14 | {145823,47063,24895} | {0.484667,0.257333,0.242}

> Those 3 values in reality and in the stats account for 98% of the
> rows. actual distinct values are around 350

So you need to increase the stats target for this column.  With those
numbers the planner is going to assume that any value that's not one
of the big three appears about (1 - (0.484667+0.257333+0.242)) / 11
of the time, or several hundred times in 300K rows.  If n_distinct were
up around 350 it would be estimating just a dozen or so occurrences,
which should push the join plan into the shape you want.  It's likely
that it won't bother to include any more entries in most_common_vals
no matter how much you raise the target; but a larger sample should
definitely give it a better clue about n_distinct.

			regards, tom lane