
Re: efficiently migrating 'old' data from one table to another


 




----- Original Message -----
> From: "Jonathan Vanasco" <postgres@xxxxxxxx>
> To: "pgsql-general general" <pgsql-general@xxxxxxxxxxxxxx>
> Sent: Thursday, January 12, 2017 3:06:14 PM
> Subject:  efficiently migrating 'old' data from one table to another
> 
> I'm just wondering if there's a more efficient way of handling a certain
> periodic data migration.
> 
> We have a pair of tables with this structure:
> 
> 	table_a__live
> 		column_1 INT
> 		column_2 INT
> 		record_timestamp TIMESTAMP
> 
> 	table_a__archive
> 		column_1 INT
> 		column_2 INT
> 		record_timestamp TIMESTAMP
> 
> Periodically, we must migrate items that are 'stale' from `table_a__live` to
> `table_a__archive`.  The entries are copied over to the archive, then
> deleted.
> 
> The staleness is calculated based on age, so we need to use an INTERVAL.  The
> "live" table can have anywhere from 100k to 20MM records.
> 
> The primary key on `table_a__live` is a composite of column_1 & column_2.
> 
> In order to minimize scanning the table, we opted to hint migrations with a
> dedicated column:
> 
> 	ALTER TABLE table_a__live ADD is_migrate BOOLEAN DEFAULT NULL;
> 	CREATE INDEX idx_table_a__live_migrate ON table_a__live(is_migrate)
> 		WHERE is_migrate IS NOT NULL;
> 
> So our migration is then based on that `is_migrate` column:
> 
> 	BEGIN;
> 	UPDATE table_a__live SET is_migrate = TRUE
> 		WHERE record_timestamp < transaction_timestamp() AT TIME ZONE 'UTC' - INTERVAL '1 month';
> 	INSERT INTO table_a__archive (column_1, column_2, record_timestamp)
> 		SELECT column_1, column_2, record_timestamp FROM table_a__live WHERE is_migrate IS TRUE;
> 	DELETE FROM table_a__live WHERE is_migrate IS TRUE;
> 	COMMIT;
> 
> The inserts & deletes are blazing fast, but the UPDATE is a bit slow because
> Postgres rewrites all the rows.
> 
> Can anyone suggest a better approach?
> 
> I considered copying everything to a temp table, then inserting/deleting based
> on that table -- but there's a lot of disk I/O with that approach too.


Review manual section 7.8.2. Data-Modifying Statements in WITH


https://www.postgresql.org/docs/9.6/static/queries-with.html
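
For example, here is a minimal sketch (untested, reusing the table and column
names from the post above) of how a data-modifying CTE could collapse the
UPDATE/INSERT/DELETE sequence into a single statement, so the rows never have
to be rewritten with an `is_migrate` flag at all:

	-- Sketch only: move rows older than one month in a single statement.
	-- DELETE ... RETURNING feeds the archive INSERT directly, so the
	-- is_migrate column (and its extra UPDATE pass) is not needed.
	WITH moved AS (
	    DELETE FROM table_a__live
	    WHERE record_timestamp <
	          transaction_timestamp() AT TIME ZONE 'UTC' - INTERVAL '1 month'
	    RETURNING column_1, column_2, record_timestamp
	)
	INSERT INTO table_a__archive (column_1, column_2, record_timestamp)
	SELECT column_1, column_2, record_timestamp
	FROM moved;

If a single pass over 20MM rows is still too heavy, the DELETE's WHERE clause
could be restricted to a bounded batch of primary keys (e.g. an
IN (SELECT ... LIMIT n) subquery) and the statement run repeatedly, but that
is a separate tuning question.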


-- B




