On Thu, Feb 14, 2013 at 7:23 AM, Jan Strube <js@xxxxxxxxx> wrote: > Hi, > > I have the following query which runs reasonably fast under PostgreSQL > 9.1.8: > > SELECT > b."ISIN", > CASE > WHEN b."COMMENT" IS NOT NULL THEN b."COMMENT" > WHEN cc."ISIN" IS NOT NULL THEN cc.comment > ELSE get_comment(b."ISIN") > END AS "COMMENT" > FROM dtng."Z_BASE" b > LEFT JOIN dtng.cached_comments cc on b."ISIN" = cc."ISIN" AND cc.cache_time >>= b._last_modified > WHERE b."ISIN" IN (SELECT "ISIN" from dtng."Z_BASE" LIMIT 1) > > Here is the query plan: > > Nested Loop Left Join (cost=0.08..16.65 rows=1 width=1053) > Join Filter: (cc.cache_time >= b._last_modified) > -> Nested Loop (cost=0.08..8.67 rows=1 width=644) > -> HashAggregate (cost=0.08..0.09 rows=1 width=13) > -> Subquery Scan on "ANY_subquery" (cost=0.00..0.08 rows=1 width=13) > -> Limit (cost=0.00..0.07 rows=1 width=13) > -> Seq Scan on "Z_BASE" (cost=0.00..106515.68 rows=1637368 width=13) > -> Index Scan using "Z_BASE_pkey" on "Z_BASE" b (cost=0.00..8.57 rows=1 > width=644) > Index Cond: (("ISIN")::bpchar = ("ANY_subquery"."ISIN")::bpchar) > -> Index Scan using cached_comments_pkey on cached_comments cc > (cost=0.00..7.71 rows=1 width=425) > Index Cond: ((b."ISIN")::bpchar = ("ISIN")::bpchar) > > When I´m trying to put this into a view, it becomes extremely slow: > > CREATE VIEW export_comments AS > SELECT > b."ISIN", > CASE > WHEN b."COMMENT" IS NOT NULL THEN b."COMMENT" > WHEN cc."ISIN" IS NOT NULL THEN cc.comment > ELSE get_comment(b."ISIN") > END AS "COMMENT" > FROM dtng."Z_BASE" b > LEFT JOIN dtng.cached_comments cc on b."ISIN" = cc."ISIN" AND cc.cache_time >>= b._last_modified > > SELECT * > FROM export_comments > WHERE "ISIN" IN (SELECT "ISIN" FROM dtng."Z_BASE" LIMIT 1) > > The query plan now is: > > Hash Join (cost=79926.52..906644.87 rows=818684 width=45) > Hash Cond: ((b."ISIN")::bpchar = ("ANY_subquery"."ISIN")::bpchar) > -> Hash Left Join (cost=79926.42..884049.08 rows=1637368 width=1053) > Hash Cond: ((b."ISIN")::bpchar = (cc."ISIN")::bpchar) > Join Filter: (cc.cache_time >= b._last_modified) > -> Seq Scan on "Z_BASE" b (cost=0.00..106515.68 rows=1637368 width=644) > -> Hash (cost=74620.41..74620.41 rows=77841 width=425) > -> Seq Scan on cached_comments cc (cost=0.00..74620.41 rows=77841 width=425) > -> Hash (cost=0.09..0.09 rows=1 width=13) > -> HashAggregate (cost=0.08..0.09 rows=1 width=13) > -> Subquery Scan on "ANY_subquery" (cost=0.00..0.08 rows=1 width=13) > -> Limit (cost=0.00..0.07 rows=1 width=13) > -> Seq Scan on "Z_BASE" (cost=0.00..106515.68 rows=1637368 width=13) > > By the way I get the same behaviour and query plan when I try this: > > SELECT * > FROM ( > -- above view definition > ) x > WHERE x."ISIN" IN (SELECT "ISIN" FROM dtng."Z_BASE" LIMIT 1) > > We already found out that the problem is the Perl function "get_comment" > which is very expensive. In the first case the function is called at most > once, but in the second case it is called many times. I believe this is > because of the hash join which causes the view to fetch everything from > dtng."Z_BASE" first? > The question is, how to avoid this? We tried to set the functions cost from > 100 to 10000000 but that did not help. (Because of the architecture of the > software that uses this query, we have the constraint that structure of the > final WHERE clause (WHERE "ISIN" IN (...)) must not be altered.) > > Thanks a lot for any idea, > Jan is your function stable/immutable, and if so is it decorated as such. merlin -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general