On Thu, Jul 19, 2012 at 11:07 AM, Jon Nelson <jnelson+pgsql@xxxxxxxxxxx> wrote:
> Recently I found myself wondering what was taking a particular query so long.
> I immediately assumed it was a lack of I/O, because lack of I/O is a
> thorn in my side.
> Nope, the I/O was boring. CPU? Well, the process was using 100% of the
> CPU, but the query itself was really very simple.
> I turned to ltrace (horribly imprecise, I know). ltrace told me this:
>
>
> % time     seconds  usecs/call     calls function
> ------ ----------- ----------- --------- --------------------
>  46.54    6.789433          69     97766 memcpy
>  28.16    4.108324        1100      3732 strlen
>  14.45    2.107567         564      3732 malloc
>   9.16    1.336108          28     46877 memset
>   0.74    0.107935          28      3732 strcpy
>   0.73    0.107221          28      3732 free
>   0.16    0.023687         187       126 write
>   0.02    0.003587          28       126 __errno_location
>   0.02    0.003075          59        52 read
>   0.01    0.001523          29        52 memcmp
> ------ ----------- ----------- --------- --------------------
> 100.00   14.588460              159927 total
>
>
> and this:
>
> strlen("SRF multi-call context")
> strcpy(0xe01d40, "SRF multi-call context")
> malloc(1024)
> memcpy(...)
> memset(...)
> ...
> memset(...)
> free(..)
>
> repeat.
>
> I was rather surprised to learn that (per-row):
> (1) the memcpy of 64 bytes accounted for 46% of the time spent in library calls
> (2) the (other) costs of strlen, strcpy, malloc, and memset were so
> huge (in particular, strlen)
>
> What, if anything, can be done about this? It seems the overhead of
> setting up the memory context for the SRF is pretty high.
> I notice this overhead pretty much every time I use any of the array
> functions, like unnest.
>
> Please help me to understand if I'm misinterpreting things here.
>
> [x86_64, Linux, PostgreSQL 9.1.4]

A follow-up. Recently, I imported a bunch of data. The import ran in
about 30 seconds. The data itself was represented in a way that made
more sense - from a relational-database perspective - as multiple
tables. To accomplish this, I made use of string_to_array and unnest.

The initial table creation and COPY ran in about 30 seconds, but the
creation of the new table (create table ... as select ..
unnest(string_to_array(....))) then took over 5 minutes - 10 times as
long. A minimal sketch of the pattern appears at the end of this
message.

What is it about the array functions (actually, all set-returning
functions that I've tried) that causes them to be so expensive? The
per-call overhead is enormous in some cases.

PostgreSQL 9.1.5 on x86_64 (openSUSE 12.2 - but the effect has been
observed across several platforms and major/minor releases of
PostgreSQL).

> --
> Jon

--
Jon
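
For reference, a minimal sketch of the import pattern described above.
Table and column names are hypothetical (the actual schema was not
posted); the shape of the query is the point:

    -- Hypothetical staging table holding the raw import, with one
    -- delimited-list column (e.g. 'a,b,c').
    CREATE TABLE staging (
        id    integer,
        items text
    );

    -- Fast step: the initial load (roughly 30 seconds in the report above).
    -- COPY staging FROM '/path/to/import.csv' WITH (FORMAT csv);

    -- Slow step: normalizing the list column via set-returning functions
    -- (over 5 minutes in the report above). Each row invokes
    -- string_to_array once and unnest once per array element.
    CREATE TABLE item AS
    SELECT id,
           unnest(string_to_array(items, ',')) AS item
    FROM   staging;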