IN operator causes sequential scan (vs. multiple OR expressions)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



PostgreSQL version: 8.2.1
OS: Windows Server 2003

I have a relatively simple query where the planner chooses a sequential scan when using the IN operator but chooses an index scan when using logically equivalent multiple OR expressions. Here is the table structure and the two versions of the query:

CREATE TABLE pool_sample
(
  id integer NOT NULL,
  state character varying(25) NOT NULL,
  not_pooled_reason character varying(25) NOT NULL,
  "comment" character varying(255),
  CONSTRAINT "pk_poolSample_id" PRIMARY KEY (id)
)
WITHOUT OIDS;

CREATE INDEX "idx_poolSample_state"
  ON pool_sample
  USING btree
  (state);


The following query uses a sequential scan (i.e. filter) on the "state" column and takes about 5 seconds to execute (up to ~45 seconds with an "empty" cache):
SELECT * FROM pool_sample ps
WHERE ps.state IN ('PENDING_REPOOL_REVIEW', 'READY_FOR_REPOOL');

This version of the query uses an index scan on "state" and takes about 50 milliseconds:
SELECT * FROM pool_sample ps
WHERE ps.state = 'PENDING_REPOOL_REVIEW' OR ps.state = 'READY_FOR_REPOOL';

There are over 10 million rows in the pool_sample table and 518 rows meet the given criteria. In the first query, the planner estimates that nearly 10 million rows will be returned (i.e. almost all rows in the table). In the second query, the planner estimates 6830 rows, which seems close enough for the purposes of planning.

If I explicitly cast the state column to text, the IN query uses an index scan and performs just as well as the multiple OR version:
SELECT * FROM pool_sample ps
WHERE ps.state::text IN ('PENDING_REPOOL_REVIEW', 'READY_FOR_REPOOL');

So it would appear that the planner automatically casts the state column to text within an OR expression but does not perform the cast in an IN expression.

Our SQL is generated from an O/R mapper, so it's non-trivial (or at least undesirable) to hand tune every query like this with an explicit type cast. The only option I've come up with is to define the state column as text in the first place, thus avoiding the need to cast. Would this work? Are there any other/better options?

Thanks,
-Ryan






[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux