Search Postgresql Archives

Re: creating a subset DB efficiently ?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Mar 8, 2024 at 4:22 PM David Gauthier <dfgpostgres@xxxxxxxxx> wrote:
Here's the situation....

- The DB contains data for several projects.  
- The tables of the DB contain data for all projects (data is not partitioned on project name or anything like that)
- The "project" identifier (table column) exists in a few "parent" tables with many child... grandchild,... tables under them connected with foreign keys defined with "on delete cascade".  So if a record in one of the parent table records is deleted, all of its underlying, dependent records get deleted too.
- New projects come in, and old ones need to be removed and "archived" in DBs of their own.  So there's a DB called "active_projects" and there's a DB called "project_a_archive" (identical metadata).
- The idea is to copy the data for project "a" that's in "active_projects" to the "project_a_arhchive" DB AND delete the project a data out of "active_projects".
- Leave "project_a_archive" up and running if someone needs to attach to that and get some old/archived data.  

The brute-force method I've been using is...
1)  pg_dump "active_projects" to a (huge) file then populate "project_a_archive" using that (I don't have the privs to create database, IT creates an empty one for me, so this is how I do it).  
2) go into the "project_a_archive" DB and run... "delete from par_tbl_1 where project <> 'a' ", "delete from par_tbl_2 where project <> 'a' ", etc... leaving only project "a" data in the DB.  
3) go into the "active_projects" DB and "delete from par_tbl_1 where project = 'a' ", etc... removing project "a" from the "active_projects DB.

Ya, not very elegant, it takes a long time and it takes a lot of resources.  So I'm looking for ideas on how to do this better.

Related question...
The "delete from par_tbl_a where project <> 'a' " is taking forever.  I fear it's because it's trying to journal everything in case I want to rollback.  But this is just in the archive DB and I don't mind taking the risk if I can speed this up outside of a transaction.  How can I run a delete command like this without the rollback recovery overhead ?     


>(I don't have the privs to create database, IT creates an empty one for me, so this is how I do it).   

That's a shame.  You can do something similar with tablespaces
  Template your existing schema to create a new schema for the project (pg_dump -s)
  Create tablespace for this new project and schema

 You can then move the physical tablespace to cheaper disk and use symbolic links or... archive and/or back it up at the schema level with pg_dump -n

...as long as you don't put anything in the public schema all you are really sharing is roles otherwise a bit like a separate database





[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux