Thanks Jeff (and Tom)
On Wed, Jan 7, 2015 at 3:34 PM, Jeff Janes <jeff.janes@xxxxxxxxx> wrote:
On Wed, Jan 7, 2015 at 3:00 PM, Pawel Veselov <pawel.veselov@xxxxxxxxx> wrote:Hi.I was wondering how come there is such a drastic difference between finding max and min. Seems like "index scan backwards" is really bad... The table is freshly re-indexed just in case. I added a count(*) in there, forcing the seq scan, and it's even better than the backwards index scan...db=> EXPLAIN ANALYZE select min(rowdate) from r_agrio where blockid = 4814;It crawls the data in rowdate order (either forward or reverse) until it finds the first 4814. Crawling forward it finds 4814 very early. Crawling backwards it has to pass through a bunch of non-4814 before it finds the first 4814.This fact doesn't show up in your EXPLAIN ANALYZE, but if you used a more modern version of postgresql (9.2 or above) there would be another line for "Rows Removed by Filter:" which would tell the story of what is going on.
Yeah, there is 10x more rows on when going backwards
If you have a composite index on (blockid, rowdate), it would help make this much faster, as it can go directly to the desired row.
That does help a lot. So, when does postgres use a more-dimensional index, even if not all dimensions are engaged (as there is an index that involves those 2 fields, and more)? I definitely see it do that in some cases...
Even with that index, however, there is still a good difference in time (the interest is theoretical at this point, as I found a better way to extract that data anyway).
On a newer db.
db=> EXPLAIN analyze select min(rowdate) from r_agrio where blockid = 4814;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=521.54..521.55 rows=1 width=0) (actual time=39.770..39.770 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.55..521.54 rows=1 width=13) (actual time=39.765..39.766 rows=1 loops=1)
-> Index Scan using rowdate_r_agrio on r_agrio (cost=0.55..303738.47 rows=583 width=13) (actual time=39.763..39.763 rows=1 loops=1)
Index Cond: ((rowdate)::text IS NOT NULL)
Filter: (blockid = 4814::numeric)
Rows Removed by Filter: 37246
Total runtime: 39.798 ms
(8 rows)
db=> EXPLAIN analyze select max(rowdate) from r_agrio where blockid = 4814;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=521.54..521.55 rows=1 width=0) (actual time=1497.377..1497.378 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.55..521.54 rows=1 width=13) (actual time=1497.371..1497.372 rows=1 loops=1)
-> Index Scan Backward using rowdate_r_agrio on r_agrio (cost=0.55..303738.47 rows=583 width=13) (actual time=1497.370..1497.370 rows=1 loops=1)
Index Cond: ((rowdate)::text IS NOT NULL)
Filter: (blockid = 4814::numeric)
Rows Removed by Filter: 317739
Total runtime: 1497.407 ms
(8 rows)
db=> CREATE INDEX concurrently xxx on r_agrio(rowdate,blockid);
CREATE INDEX
db=> EXPLAIN analyze select min(rowdate) from r_agrio where blockid = 4814;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Result (cost=85.05..85.06 rows=1 width=0) (actual time=17.585..17.585 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.43..85.05 rows=1 width=13) (actual time=17.580..17.581 rows=1 loops=1)
-> Index Only Scan using xxx on r_agrio (cost=0.43..37827.09 rows=447 width=13) (actual time=17.578..17.578 rows=1 loops=1)
Index Cond: ((rowdate IS NOT NULL) AND (blockid = 4814::numeric))
Heap Fetches: 0
Total runtime: 17.616 ms
(7 rows)
db=> EXPLAIN analyze select max(rowdate) from r_agrio where blockid = 4814;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=85.04..85.05 rows=1 width=0) (actual time=89.141..89.142 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.43..85.04 rows=1 width=13) (actual time=89.135..89.136 rows=1 loops=1)
-> Index Only Scan Backward using xxx on r_agrio (cost=0.43..37823.09 rows=447 width=13) (actual time=89.134..89.134 rows=1 loops=1)
Index Cond: ((rowdate IS NOT NULL) AND (blockid = 4814::numeric))
Heap Fetches: 1
Total runtime: 89.173 ms
(7 rows)
Cheers,