On Wednesday 19 August 2009 11:17:26 Scott Carey wrote: > On 8/19/09 9:28 AM, "Kevin Kempter" <kevink@xxxxxxxxxxxxxxxxxxx> wrote: > > Hi all; > > > > we've been fighting this query for a few days now. we bumped up the > > statistict target for the a.id , c.url_hits_id and the b.id columns below > > to 250 and ran an analyze on the relevant tables. we killed it after > > 8hrs. > > > > Note the url_hits table has > 1.4billion rows > > > > Any suggestions? > > Have you tried setting work_mem higher for just this query? Yes, we upped it to 500Meg > > The big estimated cost is the sequential scan on url_hits. But in reality, > if the estimates are off the sort and index scan at the end might be your > bottleneck. Larger work_mem might make it choose another plan there. > > But if the true cost is the sequential scan on url_hits, then only an index > there will help. > > > $ psql -ef expl.sql pwreport > > explain > > select > > a.id, > > ident_id, > > time, > > customer_name, > > extract('day' from timezone(e.name, to_timestamp(a.time))) as day, > > category_id > > from > > pwreport.url_hits a left outer join > > pwreport.url_hits_category_jt c on (a.id = c.url_hits_id), > > pwreport.ident b, > > pwreport.timezone e > > where > > a.ident_id = b.id > > and b.timezone_id = e.id > > and time >= extract ('epoch' from timestamp '2009-08-12') > > and time < extract ('epoch' from timestamp '2009-08-13' ) > > and direction = 'REQUEST' > > ; > > > > QUERY > > PLAN > > ------------------------------------------------------------------------- > >----- > > ------------------------------------------------------------------------- > >----- -------------------------------------------------------- > > Merge Right Join (cost=47528508.61..180424544.59 rows=10409251 > > width=53) Merge Cond: (c.url_hits_id = a.id) > > -> Index Scan using mt_url_hits_category_jt_url_hits_id_index on > > url_hits_category_jt c (cost=0.00..122162596.63 rows=4189283233 width=8) > > -> Sort (cost=47528508.61..47536931.63 rows=3369210 width=49) > > Sort Key: a.id > > -> Hash Join (cost=2565.00..47163219.21 rows=3369210 width=49) > > Hash Cond: (b.timezone_id = e.id) > > -> Hash Join (cost=2553.49..47116881.07 rows=3369210 > > width=37) > > Hash Cond: (a.ident_id = b.id) > > -> Seq Scan on url_hits a (cost=0.00..47051154.89 > > rows=3369210 width=12) > > Filter: ((direction = > > 'REQUEST'::proxy_direction_enum) AND (("time")::double precision >= > > 1250035200::double precision) AND (("time")::double precision < > > 1250121600::double precision)) > > -> Hash (cost=2020.44..2020.44 rows=42644 > > width=29) -> Seq Scan on ident b (cost=0.00..2020.44 rows=42644 > > width=29) > > -> Hash (cost=6.78..6.78 rows=378 width=20) > > -> Seq Scan on timezone e (cost=0.00..6.78 > > rows=378 width=20) > > (15 rows) > > > > > > -- > > Sent via pgsql-performance mailing list > > (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: > > http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance