Search Postgresql Archives

Re: temp tables and function performance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Anton Melser wrote:
> Hi,
> I am trying to move up in the world with my sql and need to do the
> following...
> I have a subscribers table and I need to export to csv (semi-colon
> separated) certain fields - that is fine, but I also need to export a
> multi-select field from another table as one string (0 to n values
> separated by commas) per line. The problem being that to get the
> actual string I have to go via 4 other relations... and as I have
> 200k+ subscribers this takes a while.

200k isn't all that much, unless you have a lot of large columns.

> My idea (which seems to work, though I haven't tested fully as it
> takes too damn long!), was to do the following. I would REALLY
> appreciate any pointers as my sql has never been this challenged!
> 
> CREATE OR REPLACE FUNCTION mytest()
>  RETURNS integer AS
> $BODY$DECLARE kindy INTEGER;
> BEGIN


> create temporary table tmp_interests(
> id bigint,
> value character varying(100)
> )
> WITHOUT OIDS
> ON COMMIT DROP;
> 
> insert into tmp_interests
> select distinct si.subid, rbi.value
> from restem rbi, cats cc, trm_terms tt, subrest si
> where rbi.key = cc.name
>     and cc.catid = tt.modcid
>     and tt.tid = si.themeid;

You can do these in one go using
 CREATE TEMPORARY TABLE tmp_interests AS SELECT ...

> create temporary table tmp_subscribers(
> email character varying(200),
> format character varying(4),
> interests  character varying(1000),
> )
> WITHOUT OIDS
> ON COMMIT DROP;
> 
> insert into tmp_subscribers
> Select email,
> format,
> my_interests(id) as interests
> from subscriber;
> 
> GET DIAGNOSTICS kindy = ROW_COUNT;
> 
> copy tmp_subscribers to '/home/myname/subs.csv' WITH CSV DELIMITER AS
> ';' NULL AS '';

If you have PG8.2 and can combine your 2 select queries into one, then
you can create a view of them and copy that instead. It takes out all
the inserts and can use your already existing table statistics - it
should be faster.

Also, after inserting a bunch of records into a table, make a habit of
running ANALYSE on it. Otherwise the query-planner knows nothing about
the data in the tables and is likely to come up with a sub-optimal query
plan.

> 
> GET DIAGNOSTICS kindy = ROW_COUNT;
> 
> return kindy;
> 
> END;$BODY$
>  LANGUAGE 'plpgsql' VOLATILE;
> 
> ...
> 
> 
> CREATE OR REPLACE FUNCTION my_interests(bigint)
>  RETURNS character varying AS
> $BODY$DECLARE
> subid ALIAS FOR $1;
> interests character varying;
> myinterest RECORD;
> 
> BEGIN
>     interests := '';
>     FOR myinterest IN execute 'select value from tmp_interests where id =
> ' || subid LOOP

No need for a dynamic query here...

>         if interests = '' then
>             interests := myinterest.value;
>         else
>             interests := interests || ',' || myinterest.value;
>         end if;
>     END LOOP;
> 
>     RETURN interests;
> END$BODY$
>  LANGUAGE 'plpgsql' VOLATILE;

I'd have to look up the syntax, but I'm quite certain you can put the
results of a select into an array. After that you can call
array_to_string(...) to convert it into a comma seperated string. That'd
take away the need for this SP (which I think is actually STABLE instead
of VOLATILE).

-- 
Alban Hertroys
alban@xxxxxxxxxxxxxxxxx

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux