Re: Concatenate performance question

"Michael Guyver" <kenevel@xxxxxxxxxxxxxx> · Sun, 3 Dec 2006 13:52:06 +0000

Gents,

At risk of answering my own question to spur someone actually to share
their thoughts on this topic, I thought I'd provide a quick look at
the performance of the alternatives: either using the || operator, or
the array_append method.

-- SELECT * FROM test_v_01();
-- SELECT * FROM test_v_02();

CREATE OR REPLACE FUNCTION test_v_01() RETURNS VARCHAR AS $$
DECLARE
	buffer varchar;
	i int4;
BEGIN
	buffer := 'the quick brown fox jumps over the lazy dog';
	FOR i IN 1..1000 LOOP
		buffer := buffer || 'the quick brown fox jumps over the lazy dog';
	END LOOP;
	RETURN buffer;
END;
$$
LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION test_v_02() RETURNS VARCHAR AS $$
DECLARE
	buffer varchar[] := '{}';
BEGIN
	buffer := array_append(buffer,'the quick brown fox jumps over the lazy dog');
	FOR i IN 1..1000 LOOP
		buffer := array_append(buffer, 'the quick brown fox jumps over the lazy dog');
	END LOOP;
	RETURN array_to_string(buffer,'');
END;
$$
LANGUAGE plpgsql;

Running the array_append version is faster by at least one order of
magnitude in these examples. However, where you can in-line the ||
operator with multiple operands, ie

buffer := buffer || 'token 1' || results.user_id::text || 'token 2' ||
results.event_id::text || 'token3';

it is faster than calling

buffer := array_append(buffer, 'token 1');
buffer := array_append(buffer, results.user_id::text);
buffer := array_append(buffer, 'token 2');
buffer := array_append(buffer, results.event_id::text);
buffer := array_append(buffer, 'token 3');

This seems entirely reasonable, as the latter requires the evaluation
of five calls, whereas the former can do it in one go.

However, my original question still stands - is there another way of
doing this? Is it possible to write to a bytea or blob or stream and
avoid having to do any concatenation at all?

Cheers

Michael

On 29/11/06, Michael Guyver <kenevel@xxxxxxxxxxxxxx> wrote:
Hi there,

I've got a rather large PL/pgSQL function which returns a varchar
(though it could be text, char or blob, I'm not fussy) containing JSON
information (where JSON is Javascript Object Notation). The middle
tier of the app does pretty much sweet FA except pass this straight
back to the client. I'm interested in seeing how much faster I can get
the app to process a request this way as opposed to retrieving the
data over three or four calls to the DB before constructing the JSON
response in the middle tier.

I've got to the point where I suspect the concatenation could do with
some attention. What is the fastest way of returning this to the
client?

I thought that storing the individual segments of text in an array and
stitiching it all together at the end of the function may be a fast
way of doing things, using an

array_to_string(resultArray,'');

call. However I have my doubts whether the

resultArray := array_append(resultArray,'next token');

is performant as it looks as though it's constructing a new array from
the argument each time its called. Can someone confirm or rebut this?

How would a simple

result := result || 'next token';

perform? The result size is in the 20-25 Kb range.

A mate mentioned that the way Oracle's OWS does it is to stream the
response back as a blob. I presume he means that the function could
iterate over the different queries' result-sets and simply write the
results to the blob before returning. Can anyone shed any light on
this approach and its applicabilty to PostgreSQL?

Cheers

Michael