Search Postgresql Archives

Re: ZFS filesystem - supported ?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Greetings,

* Lucas (root@xxxxxxx) wrote:
> > On 27/10/2021, at 8:35 AM, Stephen Frost <sfrost@xxxxxxxxxxx> wrote:
> > I do want to again stress that I don't recommend writing your own tools
> > for doing backup/restore/PITR and I would caution people against people
> > trying to use this approach you've suggested.  Also, being able to tell
> > when such a process *doesn't* work is non-trivial (look at how long it
> > took us to discover the issues around fsync..), so saying that it seems
> > to have worked for a long time for you isn't really enough to make me
> > feel comfortable with it.
> 
> I think it is worth mentioning that we’re using PG 9.2, so I don’t benefit from features like logical replication to help with backups and such.
> (Yes, I know it’s old and that I should upgrade ASAP. We’re working on it)

You'll certainly want to upgrade, of course.  That said, logical
replication isn't for backups, so not sure how that's relevant here.

> The snapshots are done this way:
> 1. Grab the latest applied WAL File for further references, stores that in a variable in Bash
> 2. Stop the Postgres process

How is PostgreSQL stopped here..?

> 3. Check it is stopped
> 4. Start the Block level EBS Snapshot process
> 5. Applied tags to the snapshot, such as the WAL file from Step #1
> 6. Wait for snapshot to complete, querying the AWS API for that
> 7. Start PG
> 8. Check it is started
> 9. Check it is replicating from master

This doesn't explain how the snapshot is restored.  That's a huge part
of making sure that any process is implemented properly.

> The entire process takes around 10 minutes to complete.

Having the database offline for 10 minutes is a luxury that many don't
have; I'm a bit surprised that it's not an issue here, but if it isn't,
then that's great.

As noted down-thread though, if you stop PostgreSQL cleanly in step #2
(that is, either 'fast' or 'smart' shutdown modes are used) then you
don't need to do anything else, a 'shutdown' checkpoint will have been
written into the WAL that is included in the snapshot and everything
should be good.  Note that this presumes that the snapshot above
includes the entire PG data directory, all tablespaces, and the WAL.  If
that's not the case then more discussion would be needed.

> I copy the WAL files to S3 in case I ever need to restore an old snapshot (we keep snapshots for 30 days), which won’t have the WAL files in the volume itself. Also for safety reasons, after reading this <https://about.gitlab.com/blog/2017/02/10/postmortem-of-database-outage-of-january-31/> article many many times.

If the database has been cleanly shut down and the pg_wal (pg_xlog)
directory of the running database is included in that snapshot then you
don't need any other WAL in order to be able to restore an individual
snapshot.  If you want to be able to perform PITR
(Point-In-Time-Recovery, which you can play forward through the WAL to
any point in time that you want to restore to without having to pick one
of the snapshots) then you would want to archive *all* of your WAL using
archive_command and a tool like pgbackrest.

> We’re working on a migration to PG 14 so, please, feel free to suggest another best way to achieve this, but it would be better in a new thread I reckon.

It isn't terribly clear what you're actually looking for in terms of
capability and so it's kind of hard to suggest something to "achieve
this".  Having snapshots taken while the database is down and then
talking about copying WAL files to s3 to restore those snapshots doesn't
strike me as making much sense though and I'd worry about other
misunderstandings around how PG writes data and how WAL is handled and
how to properly perform PG backups (online and offline) and restores.
I'm afraid that this is a quite complicated area and reading blog posts
about database outage postmortems (or even our own documentation,
frankly) is rarely enough to be able to become an expert in writing
tools for that purpose.

Thanks,

Stephen

Attachment: signature.asc
Description: PGP signature


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux