Hi everybody, I have a performance-problem with a query using a LIMIT. There are other threads rergading performance issues with LIMIT, but I didn't find useful hints for our problem and it might be interesting for other postgres-users. There are only 2 simple tables: CREATE TABLE newsfeed ( id varchar(32) PRIMARY KEY, version int4 NOT NULL, newsfeed_type varchar(20) NOT NULL, new_item_count int4 NOT NULL ); CREATE INDEX IDX_NEWSFEED_TYPE ON newsfeed (newsfeed_type); CREATE TABLE newsfeed_item ( id varchar(32) PRIMARY NOT NULL, item_type varchar(35) NOT NULL, version int4 NOT NULL, category varchar(25) NULL, data1 bytea NULL, data2 bytea NULL, date_time timestamp NOT NULL, guid1 varchar(32) NULL, guid2 varchar(32) NULL, guid3 varchar(32) NULL, id1 int8 NULL, id2 int8 NULL, long_value1 int8 NULL, long_value2 int8 NULL, long_value3 int8 NULL, string_value1 varchar(4000) NULL, string_value2 varchar(500) NULL, string_value3 varchar(500) NULL, string_value4 varchar(500) NULL, string_value5 varchar(500) NULL, string_value6 varchar(500) NULL, newsfeed varchar(32) NOT NULL ); CREATE UNIQUE INDEX newsfeed_item_pkey ON newsfeed_item (id); CREATE INDEX idx_nfi_guid1 ON newsfeed_item (guid1); CREATE INDEX idx_nfi_guid2 ON newsfeed_item (guid2); CREATE INDEX idx_nfi_guid3 ON newsfeed_item (guid3); CREATE INDEX idx_nfi_id1 ON newsfeed_item (id1); CREATE INDEX idx_nfi_id2 ON newsfeed_item (id2); CREATE INDEX idx_nfi_newsfeed ON newsfeed_item (newsfeed); CREATE INDEX idx_nfi_type ON newsfeed_item (item_type); CREATE INDEX idx_nfi_datetime ON newsfeed_item (date_time); newsfeed contains 457036 rows newsweed_item contains 5169727 rows postgres version: 9.0.2 OS: CentOS release 5.5 (Final) The following query took 4.2 seconds: ------------------------- select * from newsfeed_item where newsfeed in ( '173ee4dcec0d11de9f4f12313c0018c1','10dabde0f70211df816612313b02054e', '17841c9af70211df874b12313b02054e','1783fce2f70211df814412313b02054e','1783fdd2f70211df8c1d12313b02054e','178405a2f70211df829212313b02054e', '178440c6f70211df97c812313b02054e','178416e6f70211dfac3412313b02054e','1783e4aaf70211df9acd12313b02054e','178437e8f70211df8b8512313b02054e', '1783f54ef70211df81e012313b02054e','178415c4f70211df8f8112313b02054e' ) order by date_time desc limit 25 ------------------------- If the LIMIT was removed, the query took 60 milliseconds! If the sorting order was changed to ASC, the query took 44ms, even with the LIMIT. Then I tried to create the index on date_time in DESC order (because the result is sorted in descending order), but that did not change anything. Then I removed the index on date_time with the following results: query with the limit: 40 ms query without the limit: 60 ms => the optimizer seems to use a wrong index (I did perform an ANALYZE on newsfeed_item and a REINDEX before I did the test). Since I currently don't need the index on date_time (but will need it in the near future), I removed the index on date_time, which is ok for now. ------------------------ here are the explain analyze results: 1) the query in descending order with the limit and index on date_time (the slow one): Limit (cost=0.00..980.09 rows=25 width=963) (actual time=48.592..4060.779 rows=25 loops=1) -> Index Scan Backward using "IDX_NFI_DATETIME" on newsfeed_item (cost=0.00..409365.16 rows=10442 width=963) (actual time=48.581..4060.542 rows=25 loops=1) Filter: ((newsfeed)::text = ANY ('{173ee4dcec0d11de9f4f12313c0018c1,10dabde0f70211df816612313b02054e,17841c9af70211df874b12313b02054e,1783fce2f70211df814412313b02054e,1783fdd2f70211df8c1d12313b02054e,178405a2f70211df829212313b02054e,178440c6f70211df97c812313b02054e,178416e6f70211dfac3412313b02054e,1783e4aaf70211df9acd12313b02054e,178437e8f70211df8b8512313b02054e,1783f54ef70211df81e012313b02054e,178415c4f70211df8f8112313b02054e}'::text[])) Total runtime: 4060.959 ms 2) the query in descending order without the limit (which is much faster): Sort (cost=39575.23..39601.33 rows=10442 width=963) (actual time=15.014..17.038 rows=477 loops=1) Sort Key: date_time Sort Method: quicksort Memory: 287kB -> Bitmap Heap Scan on newsfeed_item (cost=421.41..34450.72 rows=10442 width=963) (actual time=0.644..12.601 rows=477 loops=1) Recheck Cond: ((newsfeed)::text = ANY ('{173ee4dcec0d11de9f4f12313c0018c1,10dabde0f70211df816612313b02054e,17841c9af70211df874b12313b02054e,1783fce2f70211df814412313b02054e,1783fdd2f70211df8c1d12313b02054e,178405a2f70211df829212313b02054e,178440c6f70211df97c812313b02054e,178416e6f70211dfac3412313b02054e,1783e4aaf70211df9acd12313b02054e,178437e8f70211df8b8512313b02054e,1783f54ef70211df81e012313b02054e,178415c4f70211df8f8112313b02054e}'::text[])) -> Bitmap Index Scan on idx_nfi_newsfeed (cost=0.00..418.80 rows=10442 width=0) (actual time=0.555..0.555 rows=477 loops=1) Index Cond: ((newsfeed)::text = ANY ('{173ee4dcec0d11de9f4f12313c0018c1,10dabde0f70211df816612313b02054e,17841c9af70211df874b12313b02054e,1783fce2f70211df814412313b02054e,1783fdd2f70211df8c1d12313b02054e,178405a2f70211df829212313b02054e,178440c6f70211df97c812313b02054e,178416e6f70211dfac3412313b02054e,1783e4aaf70211df9acd12313b02054e,178437e8f70211df8b8512313b02054e,1783f54ef70211df81e012313b02054e,178415c4f70211df8f8112313b02054e}'::text[])) Total runtime: 19.065 ms 3) the query in ascending order with the limit (which is fast): Limit (cost=0.00..980.09 rows=25 width=963) (actual time=0.261..3.704 rows=25 loops=1) -> Index Scan using "IDX_NFI_DATETIME" on newsfeed_item (cost=0.00..409365.16 rows=10442 width=963) (actual time=0.250..3.495 rows=25 loops=1) Filter: ((newsfeed)::text = ANY ('{173ee4dcec0d11de9f4f12313c0018c1,10dabde0f70211df816612313b02054e,17841c9af70211df874b12313b02054e,1783fce2f70211df814412313b02054e,1783fdd2f70211df8c1d12313b02054e,178405a2f70211df829212313b02054e,178440c6f70211df97c812313b02054e,178416e6f70211dfac3412313b02054e,1783e4aaf70211df9acd12313b02054e,178437e8f70211df8b8512313b02054e,1783f54ef70211df81e012313b02054e,178415c4f70211df8f8112313b02054e}'::text[])) Total runtime: 3.854 ms 4) The query after removing the index on date_time, in descending order with the LIMIT (which is fast as well). Limit (cost=34745.39..34745.45 rows=25 width=963) (actual time=12.855..13.143 rows=25 loops=1) -> Sort (cost=34745.39..34771.49 rows=10442 width=963) (actual time=12.846..12.946 rows=25 loops=1) Sort Key: date_time Sort Method: top-N heapsort Memory: 40kB -> Bitmap Heap Scan on newsfeed_item (cost=421.41..34450.72 rows=10442 width=963) (actual time=0.622..9.936 rows=477 loops=1) Recheck Cond: ((newsfeed)::text = ANY ('{173ee4dcec0d11de9f4f12313c0018c1,10dabde0f70211df816612313b02054e,17841c9af70211df874b12313b02054e,1783fce2f70211df814412313b02054e,1783fdd2f70211df8c1d12313b02054e,178405a2f70211df829212313b02054e,178440c6f70211df97c812313b02054e,178416e6f70211dfac3412313b02054e,1783e4aaf70211df9acd12313b02054e,178437e8f70211df8b8512313b02054e,1783f54ef70211df81e012313b02054e,178415c4f70211df8f8112313b02054e}'::text[])) -> Bitmap Index Scan on idx_nfi_newsfeed (cost=0.00..418.80 rows=10442 width=0) (actual time=0.543..0.543 rows=477 loops=1) Index Cond: ((newsfeed)::text = ANY ('{173ee4dcec0d11de9f4f12313c0018c1,10dabde0f70211df816612313b02054e,17841c9af70211df874b12313b02054e,1783fce2f70211df814412313b02054e,1783fdd2f70211df8c1d12313b02054e,178405a2f70211df829212313b02054e,178440c6f70211df97c812313b02054e,178416e6f70211dfac3412313b02054e,1783e4aaf70211df9acd12313b02054e,178437e8f70211df8b8512313b02054e,1783f54ef70211df81e012313b02054e,178415c4f70211df8f8112313b02054e}'::text[])) Total runtime: 13.318 ms Is there anything I can do to add the index on date_time without the performance problem? regards Dieter -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance