Hi all Firstly, I appreciate that my index problems are fairly difficult to debug given that I can't upload the data anywhere (it's commercially sensitive); I tried creating an equivalent dataset for my last problem using a lot of random() inserts, but unfortunately, even though the sizes and index cardinality seemed similar, it didn't exhibit the same problem, which leaves me a bit stuck. I now have (what seems to me to be) an utterly bizarre situation where postgres is using the "wrong" index, to the extent where I can't even begin to comprehend why it would do so. http://explain.depesz.com/s/uF4L # EXPLAIN (ANALYZE,BUFFERS) SELECT MIN(sc_id) FROM legs WHERE scdate BETWEEN 20160219 AND 20160221; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------- Result (cost=25.54..25.55 rows=1 width=0) (actual time=25337.593..25337.594 rows=1 loops=1) Buffers: shared hit=2976790 read=152188 InitPlan 1 (returns $0) -> Limit (cost=0.43..25.54 rows=1 width=4) (actual time=25337.579..25337.587 rows=1 loops=1) Buffers: shared hit=2976790 read=152188 -> Index Scan using legs_sc_id_idx on legs (cost=0.43..361498.49 rows=14394 width=4) (actual time=25337.578..25337.578 rows=1 loops=1) Index Cond: (sc_id IS NOT NULL) Filter: ((scdate >= 20160219) AND (scdate <= 20160221)) Rows Removed by Filter: 4068865 Buffers: shared hit=2976790 read=152188 Planning time: 0.235 ms Execution time: 25337.620 ms (12 rows) Time: 25338.375 ms There is an index on scdate,sc_id that (I would have thought) should be ideal for this query but it's ignored. sc_id has no null values - it's even _defined_ as NOT NULL. I've no idea why the planner would think that it needs to use the sc_id index on this query. If I create an index on sc_id,scdate, that one is used (index-only scan) and the query returns in 200ms or so. http://explain.depesz.com/s/3qNC =# EXPLAIN (ANALYZE,BUFFERS) SELECT MIN(sc_id) FROM legs WHERE scdate BETWEEN 20160219 AND 20160221; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------- Result (cost=7.32..7.33 rows=1 width=0) (actual time=207.194..207.194 rows=1 loops=1) Buffers: shared hit=1 read=11120 InitPlan 1 (returns $0) -> Limit (cost=0.43..7.32 rows=1 width=4) (actual time=207.187..207.187 rows=1 loops=1) Buffers: shared hit=1 read=11120 -> Index Only Scan using legs_sc_id_scdate_idx on legs (cost=0.43..99204.99 rows=14394 width=4) (actual time=207.185..207.185 rows=1 loops=1) Index Cond: ((sc_id IS NOT NULL) AND (scdate >= 20160219) AND (scdate <= 20160221)) Heap Fetches: 0 Buffers: shared hit=1 read=11120 Planning time: 0.236 ms Execution time: 207.223 ms I'm utterly at a loss. There are only 427 distinct scdate values on this table, but 4 million sc_id values (and the spread across scdate is reasonably similar - between 6000 and 11000 for each), so using an index on (just) sc_id makes absolutely no sense (I would expect it to be slower than a tablescan, no?). I also don't see how sc_id,scdate is more useful than scdate,sc_id. Have I completely misunderstood how this is all meant to work? I tried reading the documentation around understanding EXPLAIN and the slow query questions in the FAQ/Wiki but what I read didn't really seem to suggest any investigative steps other than "RUN ANALYZE and VACUUM" - is there a good doc on how to go about debugging this kind of thing? Or even on how the planner makes its decisions? I'm currently at the point where I'm just throwing random indexes at tables in the vain hope that it might help. I'm fairly sure that that's suboptimal :) As before, pg9.5.1, CentOS 6 x64, 4GB RAM, Xeon X3220. effective_cache_size is set to 3GB (but changing it wildly up or down doesn't change anything), shared_buffers is 1GB, work_mem is 5242kB (but changing to anything up to 1GB makes no difference). Thanks Geoff -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general