Eileen wrote: > I have written some Java code which builds a postgresql function. That function calls approximately 6 > INSERT statements with a RETURNING clause. I recreate and re-run the function about 900,000 times. I > use JDBC to execute these functions on postgresql 8.3 on Windows. When I tried running this on a > single Connection of Postgresql, it failed (some kind of memory error). So I split the JDBC > connections up into chunks of 5000. I reran and everything was fine. It took about 1 hour to execute > all the updates. > > Since it took so long to perform the update, I wanted to prevent other users from querying the data > during that time. So I read about the LOCK command. It seemed like I should LOCK all the tables in > the database with an ACCESS EXCLUSIVE mode. That would prevent anyone from getting data while the > database was making its updates. > > Since a LOCK is only valid for 1 transaction, I set autocommit to FALSE. I also removed the code > which chunked up the inserts. I had read that a single transaction ought to have better performance > than committing after each insert, but that was clearly not what ended up happening in my case. > > In my case, a few problems occurred. Number 1, the process ran at least 8 hours and never finished. > It did not finish because the hard drive was filled up. After running a manual vacuum (VACUUM FULL), > no space was freed up. I think this has cost me 20 GB of space. Is there any way to free this space > up? I even dropped the database to no avail. Try to identify what files use the space. Look at the size of directories. Could it be that "archive_mode" is "on" and you ran out of space for archived WALs? When you drop a database, all files that belong to the database are gone. > Secondly, why did this process take over 8 hours to run? While reading the performance mailing list, > it seems like recommendations are to run lots of INSERTS in a single commit. Is 5 million too many? > Is redefining a function over and over inside a transaction a problem? Does the RETURNING clause > present a problem during a single transaction? It would be interesting to know how the time was spent. Were the CPUs busy? Were there locks? Yours, Laurenz Albe -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance