> From: Joe Conway > Sent: Sunday, December 8, 2019 9:04 PM > To: Lars Aksel Opsahl; Laurenz Albe; pgsql-performance@lists.postgresql.org > Subject: Re: How to run in parallel in Postgres, EXECUTE_PARALLEL > > On 12/8/19 1:14 PM, Lars Aksel Opsahl wrote: > > Do you or anybody know if there are any plans for a function call that > > support the calling structure below or something like it and that then > > could finish in 1 second ? (If you are calling a void function, the > > return value should not be any problem.) > > > > DO > > $body$ > > *DECLARE* > > command_string_list text[3]; > > *BEGIN* > > command_string_list[0] = 'SELECT pg_sleep(1)'; > > command_string_list[1] = 'SELECT pg_sleep(1)'; > > command_string_list[2] = 'SELECT pg_sleep(1)'; > > EXECUTE_PARALLEL command_string_list; > > *END* > > $body$; > > > > The only way to this today as I understand it, is to open 3 new > > connections back to the database which you can be done in different ways. > > Yes, correct. > > > If we had a parallel functions like the one above it's easier to > > make parallel sql without using complex scripts, java, python or other > > system. > > It does require one connection per statement, but with dblink it is not > necessarily all that complex. For example (granted, this could use more > error checking, etc.): > > 8<---------------- > CREATE OR REPLACE FUNCTION > execute_parallel(stmts text[]) > RETURNS text AS > $$ > declare > i int; > retv text; > conn text; > connstr text; > rv int; > db text := current_database(); > begin > for i in 1..array_length(stmts,1) loop > conn := 'conn' || i::text; > connstr := 'dbname=' || db; > perform dblink_connect(conn, connstr); > rv := dblink_send_query(conn, stmts[i]); > end loop; > for i in 1..array_length(stmts,1) loop > conn := 'conn' || i::text; > select val into retv > from dblink_get_result(conn) as d(val text); > end loop; > for i in 1..array_length(stmts,1) loop > conn := 'conn' || i::text; > perform dblink_disconnect(conn); > end loop; > return 'OK'; > end; > $$ language plpgsql; > 8<---------------- > > And then: > > 8<---------------- > \timing > DO $$ > declare > stmts text[]; > begin > stmts[1] = 'select pg_sleep(1)'; > stmts[2] = 'select pg_sleep(1)'; > stmts[3] = 'select pg_sleep(1)'; > PERFORM execute_parallel(stmts); > end; > $$ LANGUAGE plpgsql; > DO > Time: 1010.831 ms (00:01.011) > 8<---------------- > > HTH, > > Joe > -- > Crunchy Data - http://crunchydata.com > PostgreSQL Support for Secure Enterprises > Consulting, Training, & Open Source Development Hi
Thanks a lot it works like a charm. https://github.com/larsop/find-overlap-and-gap/tree/use_dblink_for_parallel
(The test is failing now because it seems like drop EXTENSION dblink; is not cleaning up every thing)
As you say we need some error handling. And maybe some retry if not enough free connections and a parameter
for max parallel connections and so on.
So far this is best solution I have seen.
Thanks.
Lars
|