On 10/05/2015 05:02 AM, Begin Daniel wrote:
In order to process a large amount of data I need to run a procedure
using parallel batch processes.
The query I run in each process is expected to ...
1- select a bunch of id (500) in a table (wait4processing) containing
the list of all records to process
2- remove selected records from wait4processing table in order to
eliminate duplicate processing
3- run the procedure (build_contributions_clusters) over the range of
selected ids
--The query I use:
With ids as( delete from wait4processing where id in( select id from
wait4processing limit 500) returning id)
select build_contributions_clusters(min(id),max(id)) from ids;
The query runs properly if I send it sequentially (wait for the
completion of the query before sening it again) but it does'nt work
when sending multiple occurrences in parallel. Seems from the results I
got that the first query received by the server runs properly but the
following ones try to process the same first 500 records even if deleted
- the build_contributions_clusters procedure receive NULL values instead
of the following 500 records.
Since I am almost certain it is the expected behavior, I would like to
like to understand why, and I would also appreciate to
see alternative queries to do the job.
See here:
http://www.postgresql.org/docs/9.4/interactive/transaction-iso.html
"13.2.1. Read Committed Isolation Level
Read Committed is the default isolation level in PostgreSQL. When a
transaction uses this isolation level, a SELECT query (without a FOR
UPDATE/SHARE clause) sees only data committed before the query began; it
never sees either uncommitted data or changes committed during query
execution by concurrent transactions. In effect, a SELECT query sees a
snapshot of the database as of the instant the query begins to run.
However, SELECT does see the effects of previous updates executed within
its own transaction, even though they are not yet committed. Also note
that two successive SELECT commands can see different data, even though
they are within a single transaction, if other transactions commit
changes after the first SELECT starts and before the second SELECT starts."
Thanks :-)
--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general