Re: Re: best practice for moving millions of rows to child table when setting up partitioning?

Greg Smith <greg@xxxxxxxxxxxxxxx> · Mon, 02 May 2011 00:01:59 -0400

On 04/27/2011 03:35 PM, Mark Stosberg wrote:
In particular, I wanted to check whether the UPDATE statement would
alter all the rows automatically, or if the underlying trigger would
cause all the rows processed a row at a time.

It appears from my test that the result of the UPDATE was going to
appear all at once. I'm worried about the resource implications of
inserting mullions of rows all at once.

You can put a LIMIT on an UPDATE, same as any other type of query.  No 
reason that style of migration must happen all at once, you can just 
target a smaller number of rows at a time and slowly siphon rows over to 
the children by iteration.

I don't see anything inherently wrong with the approach you're 
proposing.  CREATE INDEX CONCURRENTLY does something similar to the 
logic you've outlined--mark where data was inserted at, move over all 
older data, then just copy over the new rows at the end.  The main 
downsides of that is complexity and the need for downtime to get an 
atomic swap to using the child tables.  You can't add them to the parent 
until the original is gone, if you've inserted duplicate data into them.

I'm not sure what all that complexity buys you, compared to just adding 
all the children, putting a limit on the UPDATE, and looping over that 
with some delay after each iteration until it's finished if you want to 
further control the rate.  There's usually no reason you have to be in a 
rush to moving data over.  Relying on the database's transactional 
scheme to avoid making any mistakes here--making it so a goof will 
ROLLBACK--and avoiding any need for downtime are normally higher 
priorities in a partition migration than making the move happen as fast 
as possible.

--
Greg Smith   2ndQuadrant US    greg@xxxxxxxxxxxxxxx   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

--
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin