> -----Original Message----- > From: pgsql-general-owner@xxxxxxxxxxxxxx [mailto:pgsql-general- > owner@xxxxxxxxxxxxxx] On Behalf Of Tom Lane > Sent: Friday, July 20, 2012 6:51 PM > To: David Johnston > Cc: pgsql-general@xxxxxxxxxxxxxx > Subject: Re: A Better Way? (Multi-Left Join Lookup) > > "David Johnston" <polobo@xxxxxxxxx> writes: > >> From: Tom Lane [mailto:tgl@xxxxxxxxxxxxx] Surely not. Neither merge > >> nor hash joins require an index. What plan is getting selected? > > > I have attached a scrubbed query and explain/analyze. Let me know if > > something more is needed. > > Well, here's your problem: > > > CTE master_listing {# The LEFT side of the multi-joins #} > > -> Subquery Scan on call (cost=22762.65..22762.94 rows=1 > > width=32) (actual time=619.158..735.559 rows=8656 loops=1) > > The planner thinks master_listing will return only one row, which would > make a nestloop the right way to do things. However, with 8500 rows coming > out, the nestloop iterates 8500 times and takes forever. > > So what you need to do is figure out why that rowcount estimate is so far off > and do whatever's needful to make it better. It does not have to be dead on > --- even an estimate of a few dozen rows would likely be enough to > discourage the planner from using a nestloop. > > You haven't shown enough info for anybody else to guess exactly why the > rowcount estimate is bad, though. > > regards, tom lane > OK. So, EXPLAIN SELECT function_call(...) -- yields a planner expectation of 1 row [Whereas] EXPLAIN SELECT * FROM function_call(...) -- yields a planner expectation of "result_rows" which defaults to 1000 The syntax: SELECT function_call(field_on_another_relation) FROM another_relation Is convenient in order to avoid... SELECT * FROM function_call( (SELECT field_on_another_relation FROM another_relation) ); ...especially when you need multiple fields from "another_relation" I guess I get the idea that a function used "inline" is generally going to return a single result and so the estimate of "1" is most probable. May I suggest, then, that the CREATE FUNCTION documentation for "ROWS result_rows" be modified: Something like: "The default assumption is 1,000 rows if the function is called in the FROM clause of a query. If it is called anywhere else (e.g., the Select List) the assumption is 1 row regardless of an explicit or default ROWS estimate." Was this an intentional design decision to override the result_rows estimate of the function if it is used in the select list? I get the general reasoning behind it and do not know enough regarding the internals to make a judgement but if intentional could it maybe be a little smarter as to when the override occurs? Obviously the ideal solution is to implement LATERAL... FYI: I included the following section in the query I provided because I suspected the function call may have been the issue... , master_listing AS ( SELECT -- identifier fields FROM ( SELECT (func).* FROM ( SELECT fuction_generating_8500_records(...) --<<<<< Use of function in Select List; results in the 1-row estimate. 1,000 rows triggers the "MERGE LEFT JOIN" plan ) func FROM scenario_info ) call ) master (function_column_rename) ) Thank You! David J. -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general