Hot migration of tables

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



So I have a data warehouse type of postgresql server. Each day a new db is
created and new data is inserted into the new db (around 20G). The data is
fairly simple in structure, around 100 tables and some of the table is
pretty big with millions of rows. Each table has only a primary key, no
foreign key or other constrain at all. The db is never changed once the
day's insertion is done. 

I need an efficient way to migrate the db from SSD to HDD when the DB
becomes "cold" (say 1 week later, fewer people are interested in the content
any more) for sake of saving storage expenses. I want my tables still online
(readable) when the copy is in process. In theory this should work since
copying does not modify the tables nor do we modify the table at all. Only
at the moment when some metadata (such as pg_class table, etc) is updated
should the db block any reading query.

I have tried many solutions, like pg_dump and pg_restore, pg_repack, etc.
None of them achieves high throughput since I need a File System level copy
instead of re-insertion type of solution. 

An easy way to do so is to simply make a new tablespace pointing at the HDD,
and change the tablespace. However, the problem of ALTER TABLE SET
TABLESPACE is that it takes an AccessExclusiveLock to perform, which makes
my db offline for quite a while (GBs of data migration takes some time). So
I decide to change modify the source code of Postgresql 9.6. I changed the
lockmode of ALTER TABLE SET TABLESPACE to "ExclusiveLock" instead of
"AccessExclusiveLock". right after the data copy I acquire an
"AccessExclusiveLock", which persists until the command finishes.
 
I added
	*relation_close(rel, NoLock);
	rel = relation_open(tableOid, AccessExclusiveLock);*
at tablecmds.c:9703
(http://doxygen.postgresql.org/tablecmds_8c_source.html#l09703)

It seems working. But PostgreSQL is a very complex system. I know changing
the lockmode may impact other subcommands, but let's forget such issues at
this moment. I will probably make an extension and do the migration instead
of changing the source code if I am sure it does not break PostgreSQL.
Still, I am not sure my modification will introduce any other problem that I
just haven't encounter yet. So the question becomes: will changing
AccessExclusiveLock to ExclusiveLock before updating metadata introduce any
problem? Some kind of deadlock maybe? 

To my understanding it shouldn't since the only difference is
AccessExclusiveLock does not allow read. However, allowing read when the db
is still readable should not destroy other mechanisms. Am I correct?

Thanks in advance.









--
View this message in context: http://postgresql.nabble.com/Hot-migration-of-tables-tp5928902.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux