Search Postgresql Archives

Re: plpgsql functions organisation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 05/03/2015 07:14 AM, Melvin Davidson wrote:
The point was to show that yes, function calls take time, and using sub
functions take even more time. I am not about to write an additional
more detailed example just to show the same results. If you are in
doubt, I respectfully suggest you do your own testing.

Can't resist a challenge. I took an existing function that calculates an aggregated attendance count for a student or all enrolled students over a date period and modified it to call sub functions. There are two sub functions, one that finds the students enrolled over a period(which by the way calls another function) and dates range they where enrolled. The other calculates the aggregate values for each student. The original function is student_attendance, the modified student_attendance_sub. The results are below, where the first argument is the student_id(where 0 equals all students). The all students version returns 600 rows, the single student 16 rows.


hplc=> select * from student_attendance(0, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.204865s
hplc=> select * from student_attendance_sub(0, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.014101s
hplc=> select * from student_attendance(0, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.041182s
hplc=> select * from student_attendance_sub(0, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.011385s
hplc=> select * from student_attendance(0, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.040762s
hplc=> select * from student_attendance_sub(0, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.016506s

hplc=> select * from student_attendance(1, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.00291s
hplc=> select * from student_attendance_sub(1, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.004125s
hplc=> select * from student_attendance(1, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.001907s
hplc=> select * from student_attendance_sub(1, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.003476s
hplc=> select * from student_attendance(1, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.00597s
hplc=> select * from student_attendance_sub(1, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.003986s

Definite difference in the all students run, probably because one of the called functions is used in a LOOP and caching applies.



On Sun, May 3, 2015 at 5:26 AM, Alban Hertroys <haramrae@xxxxxxxxx
<mailto:haramrae@xxxxxxxxx>> wrote:


    > On 03 May 2015, at 2:56, Melvin Davidson <melvin6925@xxxxxxxxx <mailto:melvin6925@xxxxxxxxx>> wrote:
    >
    > OK, Here is a simple example that shows the difference between using a self contained function  and
    > one that calls sub functions.
    >
    > After loading all the functions below, repeat each of the EXPLAIN statements a few times and note that
    > callsubs takes almost TWICE as long to execute as nosub.
    >
    > CREATE OR REPLACE FUNCTION nosub(text)
    >   RETURNS void AS
    > $BODY$
    ...
    >     IF LENGTH(p_in_str) <= 6
    >         THEN RAISE NOTICE 'Hi %', p_in_str;
    >     ELSE
    >         RAISE NOTICE 'Hello %', p_in_str;
    >     END IF;
    >
    >     RETURN;
    > END;
    > $BODY$
    …


    >  CREATE OR REPLACE FUNCTION called1(text)
    >   RETURNS void AS
    > $BODY$
    ...
    >     RAISE NOTICE 'Hi %', p_in_str1;
    >
    >     RETURN;
    > END;
    > $BODY$
    …

    >  CREATE OR REPLACE FUNCTION called2(text)
    >   RETURNS void AS
    > $BODY$
    ...
    >     RAISE NOTICE 'Hello %', p_in_str2;
    >
    >     RETURN;
    > END;
    ...


    That's a rather uninteresting experiment, as all it does is call a
    function and raise a notice. Relative to what the functions do, the
    function call itself takes a significant amount of time. No surprise
    there, you'll see something similar in any language, even C. All
    you're showing is that calling a function takes some amount of time > 0.

    In C, a function call needs to look up an address to jump to, in
    plpgsql the database needs to look up the function body in a table.
    If the function is small and atomic it often gets called from
    multiple other functions and is probably cached anyway. The main
    difference between C and plpgsql here is that the latter is an
    interpreted language, so it does need to read in the entire function
    body after a call - which I'd expect to be quite a bit faster with a
    smaller (atomic) function body, especially when it hasn't been
    cached yet.

    So far I haven't been convinced.

    An actual use-case where the functions actually do something would
    be far more interesting. I doubt anybody writes functions just to
    raise a notice. I expect that in reality most plpgsql functions
    perform database queries and do something with the result. In such
    cases, function call overhead could be significant if the call is
    done for each record in a result set, for example. And even then
    it's worth considering whether that matters to your situation enough
    that it outweighs the usual benefits of code separation.


     > On Sat, May 2, 2015 at 7:37 PM, Adrian Klaver
    <adrian.klaver@xxxxxxxxxxx <mailto:adrian.klaver@xxxxxxxxxxx>> wrote:
     > On 05/02/2015 03:28 PM, Bill Moran wrote:
     > On Sat, 02 May 2015 15:06:24 -0700
     > Adrian Klaver <adrian.klaver@xxxxxxxxxxx
    <mailto:adrian.klaver@xxxxxxxxxxx>> wrote:
     >
     > On 05/02/2015 02:07 PM, Jeff Janes wrote:
     > On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver
    <adrian.klaver@xxxxxxxxxxx <mailto:adrian.klaver@xxxxxxxxxxx>
     > <mailto:adrian.klaver@xxxxxxxxxxx
    <mailto:adrian.klaver@xxxxxxxxxxx>>> wrote:
     >
     >      On 05/02/2015 10:12 AM, Melvin Davidson wrote:
     >
     >          AFAIK, you cannot "package" functions in  PostgreSQL,
    but it is
     >          possible to
     >          call a function from within a function.
     >
     >          That being said, I would seriously look at how and why
    you are
     >          writing
     >          your functions
     >          as functions that call other functions are not very
    efficient.
     >
     >
     >      I am not following. That is what packaging is about,
    separating out
     >      'units of work' so they can be combined as needed. Part of
    that is
     >      using existing functions in new functions/classes. In fact
    in the
     >      Postgres source I see this in many places. Now it is entirely
     >      possible I missed a memo, so I am open to a more detailed
     >      explanation of the inefficiencies involved.
     >
     >
     > The Postgres source is written in C, not in plpgsql.  C has a good
     > optimizing compiler and plpgsql doesn't.
     >
     > Does this actually matter?  I am a biologist that backed into
    computing,
     > so I realize I am weak on the fundamentals. Still the scientist in me
     > wants data backing assertions. As I understand it plpgsql works
    close to
     > the server and is optimized to do so.  I know writing in C would be a
     > better solution. Still is calling plpgsql functions inside plpgsql
     > really a bad thing when just considering plpgsql?
     >
     > The answer to that is the same answer to so many other things: it
    depends.
     >
     > plpgsql functions are slower than C. They also lack a lot of language
     > features that C has. That being said, if they're meeting your
    needs, then
     > don't worry about it. plpgsql is around because for most people,
    it works
     > well enough. There are certainly cases when you want to create
    very complex
     > logic in the database and plpgsql is liable to make that
    difficult. But
     > there are a lot of cases where having to manage pointers and a build
     > environment and all the things that go with C aren't justified,
    because
     > plpgsql has none of that complexity. There are advantages both ways.
     >
     > The beauty of PostgreSQL is that you have both available and you
     > can choose whichever is best for your situation.
     >
     > Agreed, though in my case I drop into plpythonu when I want more
    complex solutions.
     >
     >
     >
     >
     > --
     > Adrian Klaver
     > adrian.klaver@xxxxxxxxxxx <mailto:adrian.klaver@xxxxxxxxxxx>
     >
     >
     >
     > --
     > Melvin Davidson
     > I reserve the right to fantasize.  Whether or not you
     > wish to share my fantasy is entirely up to you.

    Alban Hertroys
    --
    If you can't see the forest for the trees,
    cut the trees and you'll find there is no forest.




--
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux