Planner chooses multi-column index in 9.2 when maybe it should not

Greg Sabino Mullane <greg@xxxxxxxxxxxx> · Wed, 10 Oct 2012 13:06:55 -0400

We are seeing a performance regression when moving to 9.2. There is a 
complex query that is doing a self-join, but long story short, it 
is choosing to use a multi-column index when it really ought not to be.
I was not able to develop a standalone test case without resorting 
to changing enable_seqscan, but this does show the difference:

CREATE TABLE gregtest (a TEXT PRIMARY KEY, b TEXT);
CREATE INDEX gregtest_i ON gregtest(b,a);
SET enable_seqscan = off;
EXPLAIN SELECT 1 FROM gregtest WHERE a <> 'panda' AND b <> 'yak';

On versions 8.2, 8.3, 8.4, 9.0, and 9.1, this gives:

 Seq Scan on gregtest  (cost=10000000000.00..10000000022.90 rows=855 width=0)
   Filter: ((a <> 'panda'::text) AND (b <> 'yak'::text))

Which makes sense - I would imagine that b = 'yak' would use the index, 
but the negation means the index is not very useful?

However, on 9.2, this gives:

 Bitmap Heap Scan on gregtest  (cost=8.76..31.66 rows=855 width=0)
   Filter: ((a <> 'panda'::text) AND (b <> 'yak'::text))
   ->  Bitmap Index Scan on gregtest_i  (cost=0.00..8.55 rows=860 width=0)

The above was tested on stock versions of Postgres, with no changes 
made to postgresql.conf. In the actual query, the result is something like 
this on 9.2 (columns explained below):

 Nested Loop  (cost=0.00..6050226723847.12 rows=282638194054762 width=8)
   Join Filter: ((o.foo)::text <> ((m.foo)::text || 'panda'::text))
   ->  Index Only Scan using index_i on foobar m  (cost=0.00..889187.83 rows=16998032 width=8)
   ->  Materialize  (cost=0.00..1079773.42 rows=16627702 width=8)
         ->  Index Only Scan using index_i on foobar o  (cost=0.00..931682.91 rows=16627702 width=8)
               Filter: ((status)::text <> 'split'::text)

But like this on 8.3:

 Nested Loop  (cost=1003294.60..8207409555713.15 rows=283931552087940 width=8)
   Join Filter: ((o.foo)::text <> ((m.foo)::text || 'panda'::text))
   ->  Seq Scan on foobar o  (cost=0.00..962314.95 rows=16672204 width=8)
         Filter: ((status)::text <> 'split'::text)
   ->  Materialize  (cost=1003294.60..1240121.96 rows=17030236 width=8)
         ->  Seq Scan on foobar m  (cost=0.00..919739.36 rows=17030236 width=8)

In the above, foobar has a primary key on foo, and an index named 
index_i on foobar(status, foo). In another variation of the query, 
8.3 uses foobar_pkey as well, rather than index_i, and filters that. 
Matter of fact, index_i is never used.

At any rate, the change causes the original query to run much, much 
slower. Problem on 9.2? Something wrong with our system and/or query? 
More information needed from me?

-- 
Greg Sabino Mullane greg@xxxxxxxxxxxx
End Point Corporation
PGP Key: 0x14964AC8
Attachment:
pgpyaKm7ZgEDx.pgp

Description: PGP signature