Since my first post in this series, data has been successfully exported from the proprietary db (in CSV format) and imported into postgres (PostgreSQL 8.1.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.4 20050721 (Red Hat 3.4.4-2)) using COPY. The tablespace holding the tables+indexes is now 60Gb, total time for exporting+transfering over network+importing+indexing+vacuuming is about 7 hours. I now have a query (originally using an analytical function) that behaves strangely as soon as I go from 2 person_information__id:s to 3 person_information__id:s (see below): // // 2 PID:s // mica@TEST=> explain analyze select distinct on (driver_activity.person_information__id) TEST-> driver_activity.person_information__id, TEST-> driver_activity.end_time as prev_timestamp TEST-> from TEST-> driver_activity TEST-> where TEST-> driver_activity.person_information__id in (5398,5399) TEST-> and driver_activity.driver_activity_type__id = 5 TEST-> and driver_activity.start_time < '2006-04-01' TEST-> order by TEST-> driver_activity.person_information__id, driver_activity.start_time desc; QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------------------------------------ ---------------------------- Unique (cost=1909.90..1912.32 rows=1 width=20) (actual time=1.153..1.623 rows=2 loops=1) -> Sort (cost=1909.90..1911.11 rows=485 width=20) (actual time=1.151..1.389 rows=213 loops=1) Sort Key: person_information__id, start_time -> Bitmap Heap Scan on driver_activity (cost=6.91..1888.26 rows=485 width=20) (actual time=0.141..0.677 rows=213 loops=1) Recheck Cond: (((person_information__id = 5398) AND (driver_activity_type__id = 5)) OR ((person_information__id = 5399) AND (driver_activity_type__id = 5))) Filter: (start_time < '2006-04-01 00:00:00'::timestamp without time zone) -> BitmapOr (cost=6.91..6.91 rows=485 width=0) (actual time=0.102..0.102 rows=0 loops=1) -> Bitmap Index Scan on xda_pi_dat (cost=0.00..3.46 rows=243 width=0) (actual time=0.042..0.042 rows=56 loops=1) Index Cond: ((person_information__id = 5398) AND (driver_activity_type__id = 5)) -> Bitmap Index Scan on xda_pi_dat (cost=0.00..3.46 rows=243 width=0) (actual time=0.055..0.055 rows=157 loops=1) Index Cond: ((person_information__id = 5399) AND (driver_activity_type__id = 5)) Total runtime: 1.693 ms (12 rows) // // 3 PID:s // mica@TEST=> explain analyze select distinct on (driver_activity.person_information__id) TEST-> driver_activity.person_information__id, TEST-> driver_activity.end_time as prev_timestamp TEST-> from TEST-> driver_activity TEST-> where TEST-> driver_activity.person_information__id in (5398,5399,5400) TEST-> and driver_activity.driver_activity_type__id = 5 TEST-> and driver_activity.start_time < '2006-04-01' TEST-> order by TEST-> driver_activity.person_information__id, driver_activity.start_time desc; QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------------------------------------ ------------------------------------------------------------------------ ----------------------------------------------------------------- Unique (cost=2808.35..2811.98 rows=1 width=20) (actual time=5450.281..5450.948 rows=3 loops=1) -> Sort (cost=2808.35..2810.17 rows=727 width=20) (actual time=5450.278..5450.607 rows=305 loops=1) Sort Key: person_information__id, start_time -> Bitmap Heap Scan on driver_activity (cost=2713.82..2773.80 rows=727 width=20) (actual time=5436.259..5449.043 rows=305 loops=1) Recheck Cond: ((((person_information__id = 5398) AND (driver_activity_type__id = 5)) OR ((person_information__id = 5399) AND (driver_activity_type__id = 5)) OR ((person_information__id = 5400) AND (driver_activity_type__id = 5))) AND (driver_activity_type__id = 5)) Filter: (start_time < '2006-04-01 00:00:00'::timestamp without time zone) -> BitmapAnd (cost=2713.82..2713.82 rows=15 width=0) (actual time=5436.148..5436.148 rows=0 loops=1) -> BitmapOr (cost=10.37..10.37 rows=728 width=0) (actual time=0.384..0.384 rows=0 loops=1) -> Bitmap Index Scan on xda_pi_dat (cost=0.00..3.46 rows=243 width=0) (actual time=0.135..0.135 rows=56 loops=1) Index Cond: ((person_information__id = 5398) AND (driver_activity_type__id = 5)) -> Bitmap Index Scan on xda_pi_dat (cost=0.00..3.46 rows=243 width=0) (actual time=0.115..0.115 rows=157 loops=1) Index Cond: ((person_information__id = 5399) AND (driver_activity_type__id = 5)) -> Bitmap Index Scan on xda_pi_dat (cost=0.00..3.46 rows=243 width=0) (actual time=0.126..0.126 rows=93 loops=1) Index Cond: ((person_information__id = 5400) AND (driver_activity_type__id = 5)) -> Bitmap Index Scan on xda_dat (cost=0.00..2703.21 rows=474916 width=0) (actual time=5435.431..5435.431 rows=451541 loops=1) Index Cond: (driver_activity_type__id = 5) Total runtime: 5451.094 ms (17 rows) Question: why is the extra step (Bitmap Index Scan on xda_dat) introduced in the latter case? I can see it is originating from ((((person_information__id = 5398) AND (driver_activity_type__id = 5)) OR ((person_information__id = 5399) AND (driver_activity_type__id = 5)) OR ((person_information__id = 5400) AND (driver_activity_type__id = 5))) AND (driver_activity_type__id = 5)) ..which, indented for readability, looks like this: ( ( ( (person_information__id = 5398) AND (driver_activity_type__id = 5) ) OR ( (person_information__id = 5399) AND (driver_activity_type__id = 5) ) OR ( (person_information__id = 5400) AND (driver_activity_type__id = 5) ) ) AND (driver_activity_type__id = 5) ) ..and this last AND seems unnessecary, since the predicate on (driver_activity_type__id = 5) is included in each of the above conditions. Can this be a bug in the planner? /Mikael