On Fri, Mar 1, 2013 at 2:18 AM, Davide Berra <d.berra@xxxxxxxxxxxx> wrote: > Il 28/02/2013 18:48, Merlin Moncure ha scritto: > >> On Fri, Feb 22, 2013 at 3:21 AM, Davide Berra <d.berra@xxxxxxxxxxxx> >> wrote: >>> >>> I got a problem with the performance of a PL/PGsql stored procedure >>> outputting an xml. >>> >>> Server version: PostgreSQL 8.3.6 on i686-pc-linux-gnu, compiled by GCC >>> gcc >>> (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46) >>> CPU: Intel(R) Core(TM) i3 CPU 540 @ 3.07GHz >>> RAM installed: 4GB >>> Hard Disk: Seagate 500Gb SATA 2 >>> >>> This is a simplified content of the function showing the xmlconcat >>> behaviour. >>> >>> CREATE OR REPLACE FUNCTION test_function (v_limit int) >>> RETURNS xml AS >>> $BODY$ >>> DECLARE >>> v_xml xml; >>> BEGIN >>> >>> FOR i IN 1..v_limit LOOP >>> v_xml := xmlconcat(v_xml, xmlelement(name content, 'aaaaaaa')); >>> END LOOP; >>> >>> RETURN v_xml ; >>> END >>> $BODY$ >>> LANGUAGE 'plpgsql' SECURITY DEFINER ; >>> >>> >>> As long as the v_limit parameter grows (and then the size of the output >>> xml, >>> the time needed increase exponentially. >>> Look at this examples: >>> >>> pang=# explain analyze select test_function(1000); >>> QUERY PLAN >>> >>> -------------------------------------------------------------------------------------- >>> Result (cost=0.00..0.26 rows=1 width=0) (actual time=65.430..65.431 >>> rows=1 >>> loops=1) >>> Total runtime: 65.457 ms >>> (2 rows) >>> >>> pang=# explain analyze select test_function(5000); >>> QUERY PLAN >>> >>> ---------------------------------------------------------------------------------------- >>> Result (cost=0.00..0.26 rows=1 width=0) (actual time=473.318..473.318 >>> rows=1 loops=1) >>> Total runtime: 473.340 ms >>> (2 rows) >>> >>> pang=# explain analyze select test_function(15000); >>> QUERY PLAN >>> >>> ------------------------------------------------------------------------------------------ >>> Result (cost=0.00..0.26 rows=1 width=0) (actual >>> time=4044.903..4044.904 >>> rows=1 loops=1) >>> Total runtime: 4044.928 ms >>> (2 rows) >>> >>> pang=# explain analyze select test_function(50000); >>> QUERY PLAN >>> >>> -------------------------------------------------------------------------------------------- >>> Result (cost=0.00..0.26 rows=1 width=0) (actual >>> time=94994.337..94994.369 >>> rows=1 loops=1) >>> Total runtime: 94994.396 ms >>> (2 rows) >>> >>> I already tried to update to 8.3.23 service version but i didn't see any >>> improvement. >>> >>> Do you have any suggestion about how to increase the performance of >>> xmlconcat? >>> >>> My need is to use stored procedures that calls xmlconcat more than 50000 >>> times, but it is unacceptable 94 seconds to complete the job. >>> >>> Thanks in advance >> >> typically for high performance string manipulation you have to do >> things on more purely textual level and manipulate through arrays to >> get really good performance. iterative string concatenation is >> typically wrong approach -- you have to think in set terms. >> >> also your database version is obsolete -- time to start thinking about >> upgrade. >> >> merlin >> > Thank you for the reply Merlin but i don't fully get what you mean. (sorry, > i'm not a PostgreSQL expert) > How would you change the above example function in order to improve > performance? > What do you mean with "manipulate through arrays"? well arrays, or simple aggregation. for example: select string_agg(v, '') from (select 'aaaaaaa'::text as v from generate_series(1,50000)) q; runs in ~ 30 ms. merlin -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance