Version: PostgreSQL 8.3.5 (mammoth replicator) Schema: CREATE TABLE tdiag ( diag_id integer DEFAULT nextval('diag_id_seq'::text), create_time timestamp with time zone default now(), /* time this record was created */ diag_time timestamp with time zone not null, device_id integer, /* optional */ fleet_id integer, /* optional */ customer_id integer, /* optional */ module character varying, node_kind smallint, diag_level smallint, tag character varying not null default '', message character varying not null default '', options text, PRIMARY KEY (diag_id) ); create index tdiag_create_time ON tdiag(create_time); The number of rows is around 33 million with time stamps over the past two weeks. A VACUUM ANALYZE has been done recently on the table. The create_time order is almost identical to the id order. What I want to find is the first or last entry by id in a given time range. The query I am having a problem with is: symstream2=> explain analyze select * from tdiag where (create_time >= '2011-06-03 09:49:04.000000+0' and create_time < '2011-06-06 09:59:04.000000+0') order by diag_id limit 1; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..16.75 rows=1 width=114) (actual time=69425.356..69425.358 rows=1 loops=1) -> Index Scan using tdiag_pkey on tdiag (cost=0.00..19114765.76 rows=1141019 width=114) (actual time=69425.352..69425.352 rows=1 loops=1) Filter: ((create_time >= '2011-06-03 19:49:04+10'::timestamp with time zone) AND (create_time < '2011-06-06 19:59:04+10'::timestamp with time zone)) Total runtime: 69425.400 ms PG seems to decide it must scan the diag_id column and filter each row by the create_time. If I leave out the limit I get symstream2=> explain analyze select * from tdiag where (create_time >= '2011-06-03 09:49:04.000000+0' and create_time < '2011-06-06 09:59:04.000000+0') order by diag_id; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Sort (cost=957632.43..960484.98 rows=1141019 width=114) (actual time=552.795..656.319 rows=86530 loops=1) Sort Key: diag_id Sort Method: external merge Disk: 9872kB -> Bitmap Heap Scan on tdiag (cost=25763.48..638085.13 rows=1141019 width=114) (actual time=43.232..322.441 rows=86530 loops=1) Recheck Cond: ((create_time >= '2011-06-03 19:49:04+10'::timestamp with time zone) AND (create_time < '2011-06-06 19:59:04+10'::timestamp with time zone)) -> Bitmap Index Scan on tdiag_create_time (cost=0.00..25478.23 rows=1141019 width=0) (actual time=42.574..42.574 rows=86530 loops=1) Index Cond: ((create_time >= '2011-06-03 19:49:04+10'::timestamp with time zone) AND (create_time < '2011-06-06 19:59:04+10'::timestamp with time zone)) Total runtime: 736.440 ms (8 rows) I can be explicit about the query order: select * into tt from tdiag where (create_time >= '2011-06-03 09:49:04.000000+0' and create_time < '2011-06-06 09:59:04.000000+0'); symstream2=> explain analyze select * from tt order by diag_id limit 1; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Limit (cost=2731.95..2731.95 rows=1 width=101) (actual time=440.165..440.166 rows=1 loops=1) -> Sort (cost=2731.95..2948.28 rows=86530 width=101) (actual time=440.161..440.161 rows=1 loops=1) Sort Key: diag_id Sort Method: top-N heapsort Memory: 17kB -> Seq Scan on tt (cost=0.00..2299.30 rows=86530 width=101) (actual time=19.602..330.873 rows=86530 loops=1) Total runtime: 440.209 ms (6 rows) But if I try using a subquery I get symstream2=> explain analyze select * from (select * from tdiag where (create_time >= '2011-06-03 09:49:04.000000+0' and create_time < '2011-06-06 09:59:04.000000+0')) as sub order by diag_id limit 1; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..16.75 rows=1 width=114) (actual time=90344.384..90344.385 rows=1 loops=1) -> Index Scan using tdiag_pkey on tdiag (cost=0.00..19114765.76 rows=1141019 width=114) (actual time=90344.380..90344.380 rows=1 loops=1) Filter: ((create_time >= '2011-06-03 19:49:04+10'::timestamp with time zone) AND (create_time < '2011-06-06 19:59:04+10'::timestamp with time zone)) Total runtime: 90344.431 ms How do I make this both fast and simple? -- Anthony Shipman | flailover systems: When one goes down it Anthony.Shipman@xxxxxxxxxxxxx | flails about until the other goes down too. -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance