Hi Thomas, Thanks for your reply. Yes, the query is absolutely same which I posted. Please suggest if something need to change in query. As Per your comment... The query you posted includes there two join conditions: evidence_to_do.project_id = tool_performance.project_id evidence_to_do.project_id = project.project_id But the plan only seems to enforce the equality between 'project' and 'tool_performance'. So when joining the evidence_to_do, it performs a cartesian product, producing ~52B rows (estimated). That can't be fast. Regards, Dinesh Chandra |Database administrator (Oracle/PostgreSQL)| Cyient Ltd. Noida. ------------------------------------------------------------------ Mobile: +91-9953975849 | Ext 1078 |dinesh.chandra@xxxxxxxxxx Plot No. 7, NSEZ, Phase-II ,Noida-Dadri Road, Noida - 201 305,India. -----Original Message----- From: Tomas Vondra [mailto:tomas.vondra@xxxxxxxxxxxxxxx] Sent: 25 May, 2017 9:08 PM To: Dinesh Chandra 12108 <Dinesh.Chandra@xxxxxxxxxx> Cc: pgsql-performance@xxxxxxxxxxxxxx Subject: [EXTERNAL] Re: FW: Re: Query is running very slow...... On 5/25/17 2:26 PM, Dinesh Chandra 12108 wrote: > Hi Tomas, > > Please find the below input for slow query. > > (a) something about the hardware it's running on > RAM-->64 GB, CPU->40core > > (b) amounts of data in the tables / databases > Database size :32GB > ----------------- > Tables size > ----------------- > Workflow.project : 8194 byte > workflow.tool_performance :175 MB > workflow.evidence_to_do :580 MB > > (c) EXPLAIN or even better EXPLAIN ANALYZE of the query > > "GroupAggregate (cost=16583736169.63..18157894828.18 rows=5920110 width=69)" > " -> Sort (cost=16583736169.63..16714893857.43 rows=52463075120 width=69)" > " Sort Key: tool_performance.project_id, project.project_name, tool_performance.step_id, (date_trunc('day'::text, tool_performance.insert_time)), tool_performance.user_id" > " -> Nested Loop (cost=2.42..787115179.07 rows=52463075120 width=69)" > " -> Seq Scan on evidence_to_do (cost=0.00..119443.95 rows=558296 width=0)" > " Filter: (status_id = ANY ('{15100,15150,15200,15300,15400,15500}'::bigint[]))" > " -> Materialize (cost=2.42..49843.24 rows=93970 width=69)" > " -> Hash Join (cost=2.42..49373.39 rows=93970 width=69)" > " Hash Cond: (tool_performance.project_id = project.project_id)" > " -> Seq Scan on tool_performance (cost=0.00..48078.88 rows=93970 width=39)" > " Filter: ((insert_time > '2017-05-01 00:00:00+05:30'::timestamp with time zone) AND (insert_time < '2017-05-02 00:00:00+05:30'::timestamp with time zone))" > " -> Hash (cost=1.63..1.63 rows=63 width=38)" > " -> Seq Scan on project (cost=0.00..1.63 rows=63 width=38)" > Are you sure this is the same query? The query you posted includes there two join conditions: evidence_to_do.project_id = tool_performance.project_id evidence_to_do.project_id = project.project_id But the plan only seems to enforce the equality between 'project' and 'tool_performance'. So when joining the evidence_to_do, it performs a cartesian product, producing ~52B rows (estimated). That can't be fast. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance