On Thu, Mar 10, 2011 at 9:40 AM, fork <forkandwait@xxxxxxxxx> wrote: > Given that doing a massive UPDATE SET foo = bar || ' ' || baz; on a 12 million > row table (with about 100 columns -- the US Census PUMS for the 2005-2009 ACS) > is never going to be that fast, what should one do to make it faster? > > I set work_mem to 2048MB, but it currently is only using a little bit of memory > and CPU. (3% and 15% according to top; on a SELECT DISTINCT ... LIMIT earlier, > it was using 70% of the memory). > > The data is not particularly sensitive; if something happened and it rolled > back, that wouldnt be the end of the world. So I don't know if I can use > "dangerous" setting for WAL checkpoints etc. There are also aren't a lot of > concurrent hits on the DB, though a few. > > I am loathe to create a new table from a select, since the indexes themselves > take a really long time to build. you are aware that updating the field for the entire table, especially if there is an index on it (or any field being updated), will cause all your indexes to be rebuilt anyways? when you update a record, it gets a new position in the table, and a new index entry with that position. insert/select to temp, + truncate + insert/select back is usually going to be faster and will save you the reindex/cluster. otoh, if you have foreign keys it can be a headache. > As the title alludes, I will also be doing GROUP BY's on the data, and would > love to speed these up, mostly just for my own impatience... need to see the query here to see if you can make them go faster. merlin -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance