On Wed, Aug 8, 2012 at 2:41 PM, Geert Mak <pobox@xxxxxxxxxxxxx> wrote:
> hello everybody,
>
> we are trying to move the data from table1 into table2 using a plpgsql stored procedure which performs a simple data conversion
>
> there are about 50 million rows
>
> the tables are relatively simple, less than a dozen columns, most are integer, a couple are char(32) and one is a varchar holding URLs
>
> what happens is that when we execute the stored procedure, the execution eats up the 17 GB of free space on the server and the server crashes
>
> if we apply a limit of 1 million rows, the execution completes successfully in about a minute
>
> we understand, by now, that plpgsql functions are executed in their own transaction, which commits when the function comes to an end
>
> the question is -
>
> are plpgsql functions suitable for such massive operations on data, and can this work without eating so much disk space for something which should simply be "read-change-write, read-change-write, read-change-write, …"? i.e. is there any way to force a commit inside, or something similar?
>
> or should we rather implement this operation in some external scripting language? (speed is not that important, this is a one-time conversion)

What is the general structure of the procedure? In particular, how are you browsing and updating the rows?

There is (almost) no way to force a commit inside a function -- there has been some discussion about a stored procedure and/or autonomous transaction feature that would get us there. I say 'almost' because you can emulate some aspects of autonomous transactions with dblink, but that may not be a very good fit for your particular case.

merlin
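For the record, a minimal sketch of the dblink batching idea -- assuming a hypothetical integer key "id" on table1, made-up target columns (url_md5, url), a placeholder connection string, and md5(url) standing in for the real conversion, since the actual function wasn't posted:

-- requires the dblink contrib module (CREATE EXTENSION dblink; on 9.1+)
DO $$
DECLARE
  batch bigint := 1000000;   -- rows committed per round trip
  lo    bigint := 0;
  hi    bigint;
BEGIN
  SELECT max(id) INTO hi FROM table1;             -- "id" is an assumed integer key
  PERFORM dblink_connect('conv', 'dbname=mydb');  -- connection string is a placeholder
  WHILE lo < hi LOOP
    -- each dblink_exec runs and commits in the remote session on its own,
    -- so at most one batch worth of work is pending at any time
    PERFORM dblink_exec('conv',
      'INSERT INTO table2 (id, url_md5, url) '
      || 'SELECT id, md5(url), url FROM table1 '  -- stand-in for the real conversion
      || 'WHERE id > ' || lo || ' AND id <= ' || (lo + batch));
    lo := lo + batch;
  END LOOP;
  PERFORM dblink_disconnect('conv');
END $$;

If the outer DO block dies partway through, the batches already pushed through dblink stay committed -- which is both the point of the autonomous-transaction emulation and the cleanup burden that comes with it. An external script issuing the same batched INSERT ... SELECT statements over an ordinary connection would behave the same way.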