rafael.domiciano@xxxxxxxxx (Rafael Domiciano) writes:
> Hello Kevin, thanks for the response.
>
> Doing the ALTER TABLE to add the new column was fast:
>
>   ALTER TABLE table1 ADD COLUMN new_column date;
>
> The problem is that I now have to update this column, and the values
> will be a combination of two other columns of table1, something like
> this:
>
>   update table1
>      set new_column = (date)
>    where new_column is null;
>
> Postgres version: 8.3.6
> OS: Fedora Core 9
> RAM: 4 GB

If you try to do this in one "swell foop," it's going to take hours,
hold locks that block other writers against the table for the whole
duration, and bloat the table, all of which is exactly what you don't
want...

I'd suggest doing the updates in more bite-sized pieces, a few thousand
tuples at a time.

Further, to make that efficient, I'd suggest adding an index, at least
temporarily, on some column in the table that's generally unique.  (A
date stamp that *tends* to vary would be plenty good enough; it doesn't
need to be strictly unique.  What's important is that there shouldn't
be many repeated values in the column.)

The initial set of changes would then be done thus:

  alter table table1 add column new_column timestamptz;

  create index concurrently temp_newcol_idx
    on table1 (quasi_unique_column)
    where (new_column is null);

It'll take a while for that index to become available, but you don't
really need it until a lot of tuples have already had new_column set.

Then run a query like the following:

  update table1
     set new_column = [whatever calculation]
   where new_column is null
     and quasi_unique_column in
         (select quasi_unique_column
            from table1
           where new_column is null
           limit 1000);

Repeat this until it no longer finds any tuples to "fix" (a sketch of
such a driver loop follows at the end of this message).  Once that's
complete, the temporary index may be dropped.

The number 1000 is somewhat arbitrary:

 - 1 would be bad, as it means running 8 million queries to process an
   8M-tuple table;

 - 8000000 would be bad, as it tries to do the whole thing in one big
   batch, taking a long time, locking things, bloating things, and
   consuming a lot of memory.

1000 is materially larger than 1, but also materially smaller than
8000000.  Using 10000 instead would mean more work done in each
transaction; you might try varying the count, and stop increasing it
once you cease to see improvement from doing more work in bulk.  I
doubt there's a material difference between 1000 and 10000.

Make sure the table gets vacuumed once in a while.  That doesn't need
to be continuous, but if you want the table to bloat by only ~10%, you
should vacuum once for every 10% of the table that you update.

-- 
(reverse (concatenate 'string "gro.mca" "@" "enworbbc"))
http://www3.sympatico.ca/cbbrowne/advocacy.html
Mary had a little lambda
A sheep she couldn't clone
And everywhere that lambda went
Her calculus got blown
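
For concreteness, here is a minimal sketch of the repeat-until-done
driver loop described above, written with psycopg2 (Python).  The
connection string, the now() placeholder in the SET clause, and the
vacuum-every-800-batches interval are illustrative assumptions only;
substitute the real calculation and tune the numbers to taste.  What it
demonstrates is that each batch runs and commits in its own
transaction, with an occasional VACUUM so the table only bloats by
roughly 10% at a time.

  import psycopg2
  from psycopg2 import extensions as ext

  BATCH_SQL = """
      UPDATE table1
         SET new_column = now()   -- placeholder: use the real calculation here
       WHERE new_column IS NULL
         AND quasi_unique_column IN (SELECT quasi_unique_column
                                       FROM table1
                                      WHERE new_column IS NULL
                                      LIMIT 1000)
  """

  conn = psycopg2.connect("dbname=mydb")   # hypothetical connection string
  cur = conn.cursor()
  batches = 0
  while True:
      cur.execute(BATCH_SQL)
      rows = cur.rowcount                  # tuples fixed in this batch
      conn.commit()                        # each batch is its own transaction
      if rows == 0:
          break                            # nothing left to fix
      batches += 1
      if batches % 800 == 0:               # ~10% of an 8M-row table at 1000 rows/batch
          conn.set_isolation_level(ext.ISOLATION_LEVEL_AUTOCOMMIT)
          cur.execute("VACUUM table1")     # VACUUM must run outside a transaction
          conn.set_isolation_level(ext.ISOLATION_LEVEL_READ_COMMITTED)
  conn.close()

Running the same loop with LIMIT 10000 instead of 1000 just means
fewer, larger transactions; the structure doesn't change.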