Greetings, * Dylan Luong (Dylan.Luong@xxxxxxxxxxxx) wrote: > We perform nighty base backup of our production PostgreSQL instance. We have a script that basically puts the instance into back mode and then backs up (tar) the /Data directory and then takes it out of backup mode. > Ie, > psql -c "SELECT pg_start_backup('${DATE}');" > tar -cvf - ${DATA_DIR} --exclude ${DATA_DIR}/pg_log | split -d -b $TAR_SPLIT_SIZE - ${BACKUP_DIR}/${BACKUP_NAME} > psql -c "SELECT pg_stop_backup();" This doesn't do anything to ensure that the backup files were actually written out to disk, meaning that you might lose files or portions of files if the system crashed shortly after this completed. You don't mention anything about archive_command or how you are handling WAL? Note that you must have the WAL generated between the start backup and the stop backup for the backup to be valid. If you do have an archive command and it looks anything like what you have for your backup command, you likely have the same issue there where the WAL isn't being synced out to disk and therefore you might lose some WAL on a crash and not be able to do PITR or possibly utilize a given backup. You also don't meniton anything about checking that you have all of the WAL needed between the start/stop before considering the backup valid. I'd strongly suggest you look into a designed-for-PG backup solution instead of trying to roll your own and trying to fix all of these issues. There's a few of them out there that I'd suggest considering- pgBackRest being favorite (but I'm biased), barman, or wal-e (maybe wal-g). > The size of our database is about 250GB and it usually takes about 1 hour to backup. > During this time, we have performance issue where queries can take up to 15secs to return where normally it takes 2 to 3 seconds. > During this time (1:30am) usage is low (less than 10 users) on the system. > > Has anyone experience the same problem and any suggestions where to look at to resolve the problem? Some of the backup tools for PG support throttling to slow down the backup, but my recommendation here would be to stand up a replica and have the bulk of the data pulled from the replica for a given backup instead of doing it from the primary. As mentioned down-thread, you're likely saturating either your i/o bandwidth or your network bandwidth (if you're going to an NFS mount or similar). Thanks! Stephen
Attachment:
signature.asc
Description: PGP signature