Hi,
I am using PostgreSQL 8.1.3 for this. If this has been addressed in a later release, please disregard. This is not a complaint or a feature request; I am just curious, so that I know how best to construct my queries.
There is a unique index mapping domains to domain_ids.
views_ts is a partitioned table, and views_ts_2007_04_01 is the only partition matching the range given in the query.
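For reference, the setup is roughly the following. The column types, the check constraint, and the use of 8.1-style inheritance partitioning are my reconstruction; only the table names and the two index names actually appear in the plans below:

create table domains (
    domain_id  integer primary key,
    domain     text not null
);
-- unique index used to look up a domain_id from a domain name
create unique index domains_domain on domains (domain);

-- parent table; daily partitions are inheritance children
create table views_ts (
    domain_id  integer not null,
    ts         timestamptz not null
);

-- one partition per day, with a check constraint so that
-- constraint_exclusion can prune partitions outside the queried range
create table views_ts_2007_04_01 (
    check (ts >= '2007-04-01' and ts < '2007-04-02')
) inherits (views_ts);
create index views_ts_2007_04_01_domain_id on views_ts_2007_04_01 (domain_id);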
My goal is to produce a summary of row counts for each day within a given range (which can span days, months, or years).
The issue: the second query gets a much lower cost estimate, and I am wondering why the planner did not choose an equivalent plan for the first query.
Thanks!
Brian
live=> explain select ts::date,count(*) from views_ts join domains using (domain_id) where domain = '1234.com' and ts >= '2007-04-01' and ts < '2007-04-02' group by ts::date;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=9040.97..9041.00 rows=2 width=8)
-> Hash Join (cost=6.01..9040.96 rows=2 width=8)
Hash Cond: ("outer".domain_id = "inner".domain_id)
-> Append (cost=0.00..7738.01 rows=259383 width=16)
-> Seq Scan on views_ts (cost=0.00..1138.50 rows=1 width=16)
Filter: ((ts >= '2007-04-01 00:00:00+10'::timestamp with time zone) AND (ts < '2007-04-02 00:00:00+10'::timestamp with time zone))
-> Seq Scan on views_ts_2007_04_01 views_ts (cost=0.00..6599.51 rows=259382 width=16)
Filter: ((ts >= '2007-04-01 00:00:00+10'::timestamp with time zone) AND (ts < '2007-04-02 00:00:00+10'::timestamp with time zone))
-> Hash (cost=6.01..6.01 rows=1 width=8)
-> Index Scan using domains_domain on domains (cost=0.00..6.01 rows=1 width=8)
Index Cond: ("domain" = '1234.com'::text)
(11 rows)
live=> explain select ts::date,count(*) from views_ts where domain_id = (select domain_id from domains where domain = '1234.com') and ts >= '2007-04-01' and ts < '2007-04-02' group by ts::date;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=1993.93..1995.99 rows=137 width=8)
InitPlan
-> Index Scan using domains_domain on domains (cost=0.00..6.01 rows=1 width=8)
Index Cond: ("domain" = '1234.com'::text)
-> Result (cost=0.00..1986.69 rows=247 width=8)
-> Append (cost=0.00..1986.07 rows=247 width=8)
-> Seq Scan on views_ts (cost=0.00..1245.75 rows=1 width=8)
Filter: ((domain_id = $0) AND (ts >= '2007-04-01 00:00:00+10'::timestamp with time zone) AND (ts < '2007-04-02 00:00:00+10'::timestamp with time zone))
-> Bitmap Heap Scan on views_ts_2007_04_01 views_ts (cost=2.86..740.32 rows=246 width=8)
Recheck Cond: (domain_id = $0)
Filter: ((ts >= '2007-04-01 00:00:00+10'::timestamp with time zone) AND (ts < '2007-04-02 00:00:00+10'::timestamp with time zone))
-> Bitmap Index Scan on views_ts_2007_04_01_domain_id (cost=0.00..2.86 rows=246 width=0)
Index Cond: (domain_id = $0)
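For what it's worth, the same forms extend to longer ranges just by widening the ts bounds, in which case more than one partition shows up under the Append. For example, a whole month with the subselect form (the month here is only an illustration):

select ts::date, count(*)
from views_ts
where domain_id = (select domain_id from domains where domain = '1234.com')
  and ts >= '2007-04-01' and ts < '2007-05-01'
group by ts::date;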