On 16/02/11 01:49, Luca Ferrari wrote:
Hello, I've got a doubt about partial indexes and the path chosen by the optimizer. Consider this simple scenario:

  CREATE TABLE p(
      pk   serial NOT NULL,
      val2 text,
      val1 text,
      b    boolean,
      PRIMARY KEY (pk)
  );

  INSERT INTO p(pk, val1, val2, b)
      VALUES( generate_series(1, 1000000), 'val1b', 'val2b', true );
  INSERT INTO p(pk, val1, val2, b)
      VALUES( generate_series(1000001, 2000000), 'val1Notb', 'val2Notb', false );

  CREATE INDEX i_p_b ON p (b) WHERE b = true;
  ANALYZE p;

So I create a table with 2 million rows, the first million with b = true and the second million with b = false. Now, doing an explain for a query that filters only on the b attribute, I get:

  EXPLAIN SELECT * FROM p WHERE b = false;
                           QUERY PLAN
  ------------------------------------------------------------
   Seq Scan on p  (cost=0.00..34706.00 rows=1000133 width=28)
     Filter: (NOT b)

So a sequential scan. I know that the optimizer will not consider an index if it is not selective enough, but I don't understand exactly why that applies in this case.
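[A few follow-up experiments that may help isolate what's going on. Note that i_p_b is a partial index built WHERE b = true, so its predicate can never cover a WHERE b = false query in the first place; the sketch below also probes the selectivity angle. The index name i_p_b_full is made up for illustration, and the actual plans depend on your statistics and cost settings, so EXPLAIN output may differ.]

```sql
-- 1. The partial index's predicate must imply the query's predicate.
--    i_p_b only contains rows WHERE b = true, so a b = false query
--    cannot use it regardless of costs.

-- 2. Even a full (non-partial) index on b is unlikely to help here,
--    because b = false matches roughly half the table:
CREATE INDEX i_p_b_full ON p (b);
EXPLAIN SELECT * FROM p WHERE b = false;
-- a sequential scan is still the expected plan

-- 3. With a selective predicate, the planner happily uses an index:
EXPLAIN SELECT * FROM p WHERE pk < 100;
-- an index scan on the primary-key index is the expected plan
```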
It is filtering, but it's not filtering enough - you're hitting 1 out of every 2 rows. Postgres doesn't store visibility information in its indexes, so if it were to do an index scan it would need to check the index for matching entries and then go back to the table data to see whether each row is still visible to the current transaction.
So doing that double hit isn't worth it - instead it just trawls through the data files to find the right rows.
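[One way to see the planner's reasoning for yourself is to discourage sequential scans and look at what it falls back to. This is a hypothetical session; since the only candidate index here is partial on b = true, there is no index the planner can substitute for a b = false query.]

```sql
-- Make sequential scans artificially expensive for this session only:
SET enable_seqscan = off;

EXPLAIN SELECT * FROM p WHERE b = false;
-- The expected plan is still a seq scan (now with a huge cost estimate),
-- which confirms no available index can serve this query at all.

RESET enable_seqscan;
```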
--
Postgresql & php tutorials
http://www.designmagick.com/

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general