My application has a function, call it "foo()", that requires initialization from a table of about 800 values. Rather than build these values into the C code, it seemed like a good idea to put them in a PG table and create a second function, call it "foo_init()", which is called once for each value, like this:
select foo_init(value) from foo_init_table order by value_id;
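For context, the setup looks roughly like this (the column types and the function body are just placeholders -- the real foo_init() hands each value to foo()'s initialization code in C, and its return value is irrelevant):

create table foo_init_table (
    value_id  integer primary key,
    value     text
);

create function foo_init(text) returns integer as $$
begin
    -- placeholder body: the real version loads the value into foo()'s
    -- internal table; only the side effect matters
    return 1;
end;
$$ language plpgsql;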
This works well, but it requires me to actually retrieve the function's value 800 times. So I thought I'd be clever:
select count(1) from (select foo_init(value) from foo_init_table order by value_id) as foo;
And indeed, count() returns 800, as expected. But my function foo_init() never gets called! Apparently the optimizer figures out that foo_init() must return one value for each row, so it doesn't bother to actually call the function.
db=> explain select count(1) from (select foo_init(value) from foo_init_table order by value_id) as foo;
                                             QUERY PLAN
----------------------------------------------------------------------------------------------------
 Aggregate  (cost=69.95..69.95 rows=1 width=0)
   ->  Subquery Scan foo  (cost=0.00..67.93 rows=806 width=0)
         ->  Index Scan using foo_init_table_pkey on foo_init_table  (cost=0.00..59.87 rows=806 width=30)
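In case anyone wants to reproduce this, here is a self-contained test along the same lines (noisy_init is just a made-up stand-in for foo_init; if the planner throws away the inner select list the way it seems to above, the wrapped query produces no notices at all):

create function noisy_init(v text) returns integer as $$
begin
    raise notice 'noisy_init called with %', v;
    return 1;
end;
$$ language plpgsql;

-- direct form: one notice per row
select noisy_init(value) from foo_init_table order by value_id;

-- wrapped form: count() still returns 800, but if the inner select
-- list is discarded, no notices appear
select count(1)
  from (select noisy_init(value) from foo_init_table order by value_id) as foo;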
This doesn't seem right to me -- how can the optimizer possibly know that a function has no side effects? Mine certainly does: functions could do all sorts of things, such as logging activity, filling in other tables, etc.
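As far as I know, the volatility classification is the only hint the planner gets about this sort of thing: a function with side effects is supposed to be marked volatile (which I believe is also the default). For example (the log table and function here are purely hypothetical, just to illustrate the declaration):

create table activity_log (msg text);  -- hypothetical log table

create function log_and_return(v text) returns integer as $$
begin
    insert into activity_log (msg) values ('called with ' || v);  -- the side effect
    return 1;
end;
$$ language plpgsql volatile;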
Am I missing something here?
Thanks,
Craig