On 16/02/11 01:49, Luca Ferrari wrote:
Hello, I've got a doubt about partial indexes and the path chosen by the optimizer. Consider this simple scenario:

  CREATE TABLE p(
      pk   serial NOT NULL,
      val2 text,
      val1 text,
      b    boolean,
      PRIMARY KEY (pk)
  );

  INSERT INTO p(pk, val1, val2, b)
      VALUES( generate_series(1, 1000000), 'val1b', 'val2b', true );
  INSERT INTO p(pk, val1, val2, b)
      VALUES( generate_series(1000001, 2000000), 'val1Notb', 'val2Notb', false );

  CREATE INDEX i_p_b ON p (b) WHERE b = true;
  ANALYZE p;

So I create a table with 2 million rows, the first million with b = true and the second million with b = false. Now, doing an explain for a query that filters only on the b attribute, I get:

  EXPLAIN SELECT * FROM p WHERE b = false;
                           QUERY PLAN
  ------------------------------------------------------------
   Seq Scan on p  (cost=0.00..34706.00 rows=1000133 width=28)
     Filter: (NOT b)

So a sequential scan. I know that the optimizer will not consider an index if it is not selective enough, but I don't understand exactly why that applies in this case.
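[A few follow-up experiments that may help isolate what's going on. Note that i_p_b is a partial index built WHERE b = true, so its predicate can never cover a WHERE b = false query in the first place; the sketch below also probes the selectivity angle. The index name i_p_b_full is made up for illustration, and the actual plans depend on your statistics and cost settings, so EXPLAIN output may differ.]

```sql
-- 1. The partial index's predicate must imply the query's predicate.
--    i_p_b only contains rows WHERE b = true, so a b = false query
--    cannot use it regardless of costs.

-- 2. Even a full (non-partial) index on b is unlikely to help here,
--    because b = false matches roughly half the table:
CREATE INDEX i_p_b_full ON p (b);
EXPLAIN SELECT * FROM p WHERE b = false;
-- a sequential scan is still the expected plan

-- 3. With a selective predicate, the planner happily uses an index:
EXPLAIN SELECT * FROM p WHERE pk < 100;
-- an index scan on the primary-key index is the expected plan
```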
It is filtering, but it's not filtering enough - you're hitting 1 out of every 2 rows. Postgres doesn't store visibility information in its indexes, so if it were to do an index scan it would need to check the index for matching entries and then go back to the table data to see whether each row is still visible to the current transaction.
So doing that double hit isn't worth it - instead it just trawls through the data files to find the right rows.
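[One way to see the planner's reasoning for yourself is to discourage sequential scans and look at what it falls back to. This is a hypothetical session; since the only candidate index here is partial on b = true, there is no index the planner can substitute for a b = false query.]

```sql
-- Make sequential scans artificially expensive for this session only:
SET enable_seqscan = off;

EXPLAIN SELECT * FROM p WHERE b = false;
-- The expected plan is still a seq scan (now with a huge cost estimate),
-- which confirms no available index can serve this query at all.

RESET enable_seqscan;
```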
--
Postgresql & php tutorials
http://www.designmagick.com/

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general