Since it's my first on this list, I'd like to say "Hi guys" :) Here is definition of my table: a9-dev=> \d records; Table "public.records" Column | Type | Modifiers --------------------------------------+-----------------------------+----------- id | bigint | not null checksum | character varying(32) | not null data | text | not null delete_date | timestamp without time zone | last_processing_date | timestamp without time zone | object_id | character varying(255) | not null processing_path | character varying(255) | not null schema_id | character varying(255) | not null source_id | character varying(255) | not null source_object_last_modification_date | timestamp without time zone | not null Indexes: "records_pkey" PRIMARY KEY, btree (id) "unq_records_0" UNIQUE, btree (object_id, schema_id, source_id, processing_path) "length_processing_path_id_idx" btree (length(processing_path::text), id) "length_processing_path_idx" btree (length(processing_path::text)) "object_id_id_idx" btree (object_id, id) "schema_id_id_idx" btree (schema_id, id) "schema_id_idx" btree (schema_id) "source_id_id_idx" btree (source_id, id) "source_id_idx" btree (source_id) "source_object_last_modification_date_id_idx" btree (source_object_last_modification_date, id) "source_object_last_modification_date_idx" btree (source_object_last_modification_date) Average length of value of "data" column = 2991.7947061626100466 When I perform query such as this: "select * from records where source_id = 'XXX' order by id limit 200;" I expect DB to use index source_id_id_idx with XXX as filter. It is true for all but one values of XXX - when I ask for records with most common source_id, records_pkey index is used instead and performance is terrible! Explain analyze results below. a9-dev=> explain analyze select * from records where source_id ='http://ebuw.uw.edu.pl/dlibra/oai-pmh-repository.xml' order by id limit 200; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..755.61 rows=200 width=1127) (actual time=75.292..684.582 rows=200 loops=1) -> Index Scan using source_id_id_idx on records (cost=0.00..1563542.89 rows=413849 width=1127) (actual time=75.289..684.495 rows=200 loops=1) Index Cond: ((source_id)::text = 'http://ebuw.uw.edu.pl/dlibra/oai-pmh-repository.xml'::text) Total runtime: 690.358 ms (4 rows) a9-dev=> explain analyze select * from records where source_id ='http://www.wbc.poznan.pl/dlibra/oai-pmh-repository.xml' order by id limit 200; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..466.22 rows=200 width=1127) (actual time=124093.485..124095.540 rows=200 loops=1) -> Index Scan using records_pkey on records (cost=0.00..2333280.84 rows=1000937 width=1127) (actual time=124093.484..124095.501 rows=200 loops=1) Filter: ((source_id)::text = 'http://www.wbc.poznan.pl/dlibra/oai-pmh-repository.xml'::text) Total runtime: 124130.247 ms (4 rows) Some info about data distrubution: a9-dev=> select min(id) from records; min -------- 190830 (1 row) a9-dev=> select min(id), max(id) from records where source_id='http://ebuw.uw.edu.pl/dlibra/oai-pmh-repository.xml'; min | max ---------+--------- 1105217 | 3811326 (1 row) a9-dev=> select min(id), max(id) from records where source_id='http://www.wbc.poznan.pl/dlibra/oai-pmh-repository.xml'; min | max ---------+--------- 1544991 | 3811413 (1 row) a9-dev=> select min(id), max(id) from (select id from records where source_id = 'http://ebuw.uw.edu.pl/dlibra/oai-pmh-repository.xml' order by id limit 200) as a; min | max ---------+--------- 1105217 | 1105416 (1 row) a9-dev=> select min(id), max(id) from (select id from records where source_id = 'http://www.wbc.poznan.pl/dlibra/oai-pmh-repository.xml' order by id limit 200) as a; min | max ---------+--------- 1544991 | 1545190 (1 row) a9-dev=> select source_id, count(*) from records where source_id = 'http://ebuw.uw.edu.pl/dlibra/oai-pmh-repository.xml' or source_id = 'http://www.wbc.poznan.pl/dlibra/oai-pmh-repository.xml' group by source_id; source_id | count --------------------------------------------------------+-------- http://ebuw.uw.edu.pl/dlibra/oai-pmh-repository.xml | 427254 http://www.wbc.poznan.pl/dlibra/oai-pmh-repository.xml | 989184 (2 rows) a9-dev=> select count(*) from records; count --------- 3620311 (1 row) DB settings: a9-dev=> SELECT a9-dev-> 'version'::text AS "name", a9-dev-> version() AS "current_setting" a9-dev-> UNION ALL a9-dev-> SELECT a9-dev-> name,current_setting(name) a9-dev-> FROM pg_settings a9-dev-> WHERE NOT source='default' AND NOT name IN a9-dev-> ('config_file','data_directory','hba_file','ident_file', a9-dev(> 'log_timezone','DateStyle','lc_messages','lc_monetary', a9-dev(> 'lc_numeric','lc_time','timezone_abbreviations', a9-dev(> 'default_text_search_config','application_name', a9-dev(> 'transaction_deferrable','transaction_isolation', a9-dev(> 'transaction_read_only'); name | current_setting --------------------------+----------------------------------------------------------------------------------------------------------------- version | PostgreSQL 9.0.4 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.6.0 20110530 (Red Hat 4.6.0-9), 64-bit lc_collate | en_US.UTF-8 lc_ctype | en_US.UTF-8 listen_addresses | * log_rotation_age | 1d log_rotation_size | 0 log_truncate_on_rotation | on logging_collector | on max_connections | 100 max_stack_depth | 2MB port | 5432 server_encoding | UTF8 shared_buffers | 24MB TimeZone | Poland (14 rows) This query was always slow. Autovacuum is on, and I ran VACUUM ANALYZE manually few minutes before writing this email. Please help me with my problem. I'll be happy to provide any additional information if needed. Michal Nowak -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance