> a9-dev=> explain analyze select * from records where source_id ='http://www.wbc.poznan.pl/dlibra/oai-pmh-repository.xml' order by id limit 200; > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=0.00..489.57 rows=200 width=1126) (actual time=99701.773..99703.858 rows=200 loops=1) > -> Index Scan using records_pkey on records (cost=0.00..2441698.81 rows=997489 width=1126) (actual time=99684.878..99686.936 rows=200 loops=1) > Filter: ((source_id)::text = 'http://www.wbc.poznan.pl/dlibra/oai-pmh-repository.xml'::text) > Total runtime: 99705.916 ms > (4 rows) > > a9-dev=> explain analyze select * from records2 where source_id ='http://www.wbc.poznan.pl/dlibra/oai-pmh-repository.xml' order by id2 limit 200; > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=0.00..770.01 rows=200 width=1124) (actual time=0.076..0.205 rows=200 loops=1) > -> Index Scan using source_id2_id2_id2x on records2 (cost=0.00..3735751.15 rows=970308 width=1124) (actual time=0.074..0.180 rows=200 loops=1) > Index Cond: ((source_id)::text = 'http://www.wbc.poznan.pl/dlibra/oai-pmh-repository.xml'::text) > Total runtime: 0.235 ms > (4 rows) > > > First one uses records_pkey, and with estimated cost 2441698.81 runs in over 1,5 minute. > Second uses index on (source_id, id) and with estimated cost 3735751.15 runs in 235 miliseconds. > > IMHO, problem lies not in records distribution nor normalization, but in planner's wrong cost estimation. I don't know to tell/force him to use proper index. Getting information on your current configuration should help. Please see http://wiki.postgresql.org/wiki/Slow_Query_Questions You should take care of the cache effect of your queries between your tests, here it is not a problem, but this explain was way longer for this similar query. a9-dev=> explain analyze select * from records where source_id ='http://ebuw.uw.edu.pl/dlibra/oai-pmh-repository.xml' order by id limit 200; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..757.51 rows=200 width=1126) (actual time=43.648..564.798 rows=200 loops=1) -> Index Scan using source_id_id_idx on records (cost=0.00..1590267.66 rows=419868 width=1126) (actual time=43.631..564.700 rows=200 loops=1) Index Cond: ((source_id)::text = 'http://ebuw.uw.edu.pl/dlibra/oai-pmh-repository.xml'::text) Total runtime: 564.895 ms -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance