On Sat, May 18, 2013 at 1:47 AM, Jeff Amiel <becauseimjeff@xxxxxxxxx> wrote: > > > > > ----- Original Message ----- > From: Amit Langote <amitlangote09@xxxxxxxxx> > To: Jeff Amiel <becauseimjeff@xxxxxxxxx> > Cc: "pgsql-general@xxxxxxxxxxxxxx" <pgsql-general@xxxxxxxxxxxxxx> > Sent: Friday, May 17, 2013 11:37 AM > Subject: Re: Why does row estimation on nested loop make no sense to me > > On Sat, May 18, 2013 at 1:25 AM, Jeff Amiel <becauseimjeff@xxxxxxxxx> wrote: >> On most nested loops that I do explain/explain analyze on, the row estimation for the nested-loop itself is a product of the inner nodes of the nested loop. >> However in this case, I am stumped! >> >> explain >> select era.child_entity from entity_rel era join user_entity ue on ue.entity_id = era.parent_entity and ue.user_id=12345 >> >> Nested Loop (cost=0.00..2903.37 rows=29107 width=4) >> -> Index Only Scan using entity_pk on user_entity ue (cost=0.00..62.68 rows=2 width=4) >> Index Cond: (user_id = 10954) >> -> Index Scan using rel_parent on entity_rel era (cost=0.00..1261.85 rows=317 width=8) >> Index Cond: (parent_entity = ue.entity_id) >> >> >> How can the estimated number of rows for the nested loop node EXCEED the product of the 2 row estimates of the tables being joined? >> Not only does it exceed it - but it is orders of magnitude greater. >> >> Am I missing something obvious here? I an see the nested loop row estimate being LESS but certainly not more. >> > >> Can you also post the output of explain analyze <your-query>? > > I'm not worried about performance (per se) but the row estimation issue which propagates up as part of a bigger query. But here ya go: > > explain analyze > select era.child_entity from entity_rel era join user_entity ue on ue.entity_id = era.parent_entity and ue.user_id=12345 > > Nested Loop (cost=0.00..2903.37 rows=29107 width=4) (actual time=0.028..0.274 rows=201 loops=1) > -> Index Only Scan using entity_pk on user_entity ue (cost=0.00..62.68 rows=2 width=4) (actual time=0.011..0.012 rows=1 loops=1) > Index Cond: (user_id = 12345) > Heap Fetches: 1 > -> Index Scan using rel_parent on entity_rel era (cost=0.00..1261.85 rows=317 width=8) (actual time=0.013..0.164 rows=201 loops=1) > Index Cond: (parent_entity = ue.entity_id) > Total runtime: 0.361 ms Have you tried analyze (it's probably a case of insufficient/outdated statistics to planner's disposal) or probably consider changing default_statistics_target? -- Amit Langote -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general