On 04/27/2011 03:35 PM, Mark Stosberg wrote:
In particular, I wanted to check whether the UPDATE statement would alter all the rows automatically, or if the underlying trigger would cause all the rows processed a row at a time. It appears from my test that the result of the UPDATE was going to appear all at once. I'm worried about the resource implications of inserting mullions of rows all at once.
You can put a LIMIT on an UPDATE, same as any other type of query. No reason that style of migration must happen all at once, you can just target a smaller number of rows at a time and slowly siphon rows over to the children by iteration.
I don't see anything inherently wrong with the approach you're proposing. CREATE INDEX CONCURRENTLY does something similar to the logic you've outlined--mark where data was inserted at, move over all older data, then just copy over the new rows at the end. The main downsides of that is complexity and the need for downtime to get an atomic swap to using the child tables. You can't add them to the parent until the original is gone, if you've inserted duplicate data into them.
I'm not sure what all that complexity buys you, compared to just adding all the children, putting a limit on the UPDATE, and looping over that with some delay after each iteration until it's finished if you want to further control the rate. There's usually no reason you have to be in a rush to moving data over. Relying on the database's transactional scheme to avoid making any mistakes here--making it so a goof will ROLLBACK--and avoiding any need for downtime are normally higher priorities in a partition migration than making the move happen as fast as possible.
-- Greg Smith 2ndQuadrant US greg@xxxxxxxxxxxxxxx Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-admin