On Fri, Mar 1, 2013 at 3:59 AM, Chris Hanks <christopher.m.hanks@xxxxxxxxx> wrote: > On Fri, Mar 1, 2013 at 1:21 AM, Raymond O'Donnell <rod@xxxxxx> wrote: >> >> On 01/03/2013 00:19, Chris Hanks wrote: >> > On Thu, Feb 28, 2013 at 6:00 AM, Tom Lane <tgl@xxxxxxxxxxxxx >> > <mailto:tgl@xxxxxxxxxxxxx>> wrote: >> > >> > Chris Hanks <christopher.m.hanks@xxxxxxxxx >> > <mailto:christopher.m.hanks@xxxxxxxxx>> writes: >> > > create or replace view values_view as >> > > select fkey1, fkey3, >> > > (derived1 / max(derived1) over (partition by fkey1)) as >> > derived1, >> > > (derived2 / sum(derived1) over (partition by fkey1)) as derived2 >> > > from ( >> > > select fkey1, fkey3, >> > > cast(sum((case when (value > 0.0) then 4 else 1 end)) as >> > double >> > > precision) as derived1, >> > > sum((case when (value > 0.0) then (value * 4) else (value + 1) >> > end)) as >> > > derived2 >> > > from values >> > > group by fkey1, fkey3 >> > > ) as t1; >> > >> > > -- This query requires a sequential scan on values, though all the >> > data it >> > > needs could be found much more efficiently with an index scan. >> > > explain analyze select * from values_view where fkey1 = 1263; >> > >> > To use the outer WHERE clause as an index constraint, postgres would >> > have to prove that scanning only the rows with fkey1 = 1263 would >> > still >> > find all the rows that would get examined by the window functions >> > --- >> > and in this case, it's not only the window functions that make that >> > less >> > than obvious, but the grouped aggregates in the sub-select below >> > them. >> > There's not nearly that amount of intelligence in the system about >> > window functions, as yet. So you'll have to write out the query >> > longhand and put the WHERE clause at the lower level, if you want >> > this >> > optimization to happen. >> > >> > regards, tom lane >> > >> > >> > Ok, that makes sense, thanks. >> > >> > Can anyone point me to an example of wrapping a function in a view, like >> > Merlin suggested? I'm not sure how that would work. >> >> Off the top of my head, I'd imagine it's as simple as: >> >> create view ... as >> select * from my_function(...); >> >> :-) >> >> Ray. >> >> >> -- >> Raymond O'Donnell :: Galway :: Ireland >> rod@xxxxxx > > > > Sorry, I don't understand. I'm able to make a function that takes an integer > and uses it in the subselect as "WHERE fkey1 = arg", and that works as I > expect it to and it's plenty fast. But I don't see how to write a view to > take advantage of this function - what arguments would go in > my_function(...) when I'm declaring the view? First let's clearly state the problem - create some test data: postgres=# create table foo as select v1, v2 from (select generate_series(1,10) v1) q1, (select generate_series(1,10000) v2) q2; postgres=# create table bar as select v1 from generate_series(1,10) v1; foo is data table, bar defines the key around which we have the data. create an index: postgres=# create index on foo(v1); simple lookup is optimized obviously: postgres=# explain select * from foo where v1 = 7; QUERY PLAN ---------------------------------------------------------------------- Index Scan using foo_v1_idx on foo (cost=0.00..4.38 rows=1 width=8) Index Cond: (v1 = 50) get indexed lookup for simple window function: postgres=# explain select v1, sum(v2) over(partition by v1 order by v2) from foo where v1 = 7; QUERY PLAN ---------------------------------------------------------------------------------- WindowAgg (cost=4.39..4.41 rows=1 width=8) -> Sort (cost=4.39..4.39 rows=1 width=8) Sort Key: v2 -> Index Scan using foo_v1_idx on foo (cost=0.00..4.38 rows=1 width=8) Index Cond: (v1 = 50) push to subquery (essentially what view does) and we lose the optimization: postgres=# explain select * from (select v1, sum(v2) over(partition by v1 order by v2) from foo) q where v1 = 7; QUERY PLAN ----------------------------------------------------------------------------- Subquery Scan on q (cost=9747.82..12997.82 rows=1 width=12) Filter: (q.v1 = 50) -> WindowAgg (cost=9747.82..11747.82 rows=100000 width=8) -> Sort (cost=9747.82..9997.82 rows=100000 width=8) Sort Key: foo.v1, foo.v2 -> Seq Scan on foo (cost=0.00..1443.00 rows=100000 width=8) let's work around it! postgres=# CREATE OR REPLACE FUNCTION getfoovals( v1 INOUT int, sumv2 OUT BIGINT) RETURNS SETOF RECORD AS $$ select v1, sum(v2) over(partition by v1 order by v2) from foo where foo.v1 = getfoovals.v1; $$ LANGUAGE SQL STABLE; select * from getfoovals(7); v1 | sumv2 ----+---------- 7 | 1 7 | 3 7 | 6 7 | 10 7 | 15 7 | 21 <snip> abstract to view: postgres=# CREATE OR REPLACE VIEW foovals AS select v1, (getfoovals(v1)).sumv2 from bar; postgres=# select * from foovals where v1 = 7; Main problem with this technique is awkwardness around using column lest set returning function if it returns > 1 column...if you'r not careful you can get extra invocations of function. you work hack around this to some degree via: *) offset 0 hacks *) return type coersion hacks *) 9.3 LATERAL feature completely nails it Can't use WITH because it fences of the optimization. This (window function optimization fencing) is probably #1 performance gotcha I hit in everyday coding after LATERAL and lack of better ability to inline simple SQL functions. merlin -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general