Thanks for such quick and helpful answers! My plan sounds probably better to avoid, but if it turns out to be necessary, you all gave me some helpful avenues and things to look out for.
Best,
Jack
On Tue, Aug 14, 2018 at 1:06 PM, Stephen Frost <sfrost@xxxxxxxxxxx> wrote:
Greetings,
* Jack Cushman (jcushman@xxxxxxxxx) wrote:
> I have a large database of text, with a 600GB table and a 100GB table
> connected by a join table. They both see occasional updates throughout the
> week. Once a week I want to "cut a release," meaning I will clone just the
> 100GB table and copy it to a "release" server for read-only web access.
My general recommendation to people who are thinking about something
like this is to use their restore-tests as a way to stage things (you
are testing your backups by doing a restore, right?) and then copy over
the results.
> My procedure would be:
>
> - keep the 600GB table on a separate tablespace
> - cleanly stop postgres on both servers
> - copy the data folder to the release server
> - delete pg_tblspc/* on the release server
> - start postgres on both servers
So, instead of that procedure, it'd be:
- Back up the database as per usual
- Restore the database somewhere
- Run some sanity checks on the restored database
- go in and drop the table and sanitize anything else necessary
- Shut down the database and copy it into place
- OR take a new backup of the sanitized database and then restore it
into place
Much cleaner, and tests your backup/restore process.
Alternatively, you could just track changes to the "main" database using
triggers into an audit log and then replay the changes made to the 100GB
table into the other database.
> In local testing this seems to work -- the release server works fine, and I
> only get an error message if I try to access the missing tables, which is
> expected. But are there reasons this is going to bite me if I try it in
> production? I'm hoping it helps that (a) I'm only doing read access, (b) I
> can cleanly stop both servers before cutting a release, and (c) I'm not
> worried about losing data, since it's just an access copy.
Still, it's a hacked up and not entirely proper PG database which will
likely lead to confusion- maybe you won't be confused, but I strongly
suspect others looking at it will be, and you might run into other
issues too along the lines of what Tom mentioned (background jobs
failing and such).
Thanks!
Stephen