
(Questioning the planner's mind) - was Re: Fastest way to drop an index?


 



On Wed, Feb 4, 2009 at 5:13 AM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> Phoenix Kiula <phoenix.kiula@xxxxxxxxx> writes:
>>  Index Scan using new_idx_testimonials_userid on testimonials
>> (cost=0.00..157.78 rows=1 width=9) (actual time=8809.715..8809.715
>> rows=0 loops=1)
>>    Index Cond: ((user_id)::text = 'superman'::text)
>>    Filter: ((title_encrypted)::text = 'b333dc1b0992cb8c70b58a418211389a'::text)
>>  Total runtime: 8809.750 ms
>
> This is using the index to fetch the rows that match user_id = 'superman',
> and then testing each fetched row to see if it has the desired value of
> title_encrypted.  The fact that hardly any rows pass the filter test
> tells nearly nothing about how long this should be expected to run.
> The rather high estimated cost suggests that the planner thinks there
> are several dozen rows matching the index condition, and the actual
> runtime suggests that there are actually hundred or thousands of 'em.
> If so, your problem is that you need a different index.  I'd bet on an
> index on title_encrypted being more useful for this query than the one
> on user_id; or you could experiment with a two-column index.
>



Thanks Tom. My thinking exactly.

I have in fact already created a two-column index on (user_id,
title_encrypted).
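For reference, the two-column index I created looks roughly like this (the index name is mine; the table and column names are from the EXPLAIN output above):

```sql
-- Two-column index covering both conditions in the query.
-- Index name is assumed for illustration.
CREATE INDEX idx_testimonials_userid_title
    ON testimonials (user_id, title_encrypted);
```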

But the planner keeps insisting on the "user_id" index, as you can see
from that EXPLAIN ANALYZE -- and that plan was produced while the
two-column index already existed!

Could I force the 2-col index? I googled for "force index postgresql"
and came upon this discussion -
http://archives.postgresql.org/pgsql-sql/2006-02/msg00190.php  - which
suggests that the planner selects indexes based on estimated "cost".

I am not too technically savvy, but I think this means that given the
choice of these two scenarios...

    1. Search through "user_id" index, and then limit it by "title_encrypted"
or
    2. Search through "user_id, title_encrypted" 2-col index

...the planner decides that it is less resource intensive to go
through the somewhat smaller user_id index and then limit it (i.e.,
scenario 1) than to wade through the bigger second index.

Am I on the right track? If so, what's the way around this? How can I
make the planner use the 2-col index? Or, if my understanding is
wrong, why is scenario 1 being chosen to begin with?
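Since PostgreSQL has no index hints, one common way to see what the planner would do with only the two-column index available is to hide the single-column index inside a transaction and roll back. This is a sketch only -- the index name is taken from the EXPLAIN output above, and the selected column is assumed:

```sql
-- Temporarily hide the single-column index from the planner,
-- compare plans, then roll back so nothing changes permanently.
BEGIN;
DROP INDEX new_idx_testimonials_userid;
EXPLAIN ANALYZE
    SELECT *
      FROM testimonials
     WHERE user_id = 'superman'
       AND title_encrypted = 'b333dc1b0992cb8c70b58a418211389a';
ROLLBACK;  -- restores the dropped index
```

Note that DROP INDEX takes an exclusive lock on the table until the transaction ends, so this is only safe to try on a quiet or test system.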

Thanks for any thoughts! This single query, which used to be much
faster, is now slowing down our operations by about 8 seconds per
query!

Let me know if you need any postgresql.conf settings. The only
index-related settings I know of are these:

  enable_indexscan             = on
  enable_bitmapscan            = off
  enable_nestloop              = on


Thanks

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
