Hi Daniel:

On Fri, May 15, 2015 at 5:35 PM, Daniel Begin <jfd553@xxxxxxxxxxx> wrote:
> Bonjour Francisco.

Buenos dias.

> Skimming the documentation sequentially is clever advice, especially since
> the doc is most of the time well done and exhaustive. Unfortunately, even
> if I actually did it about 1 year ago, it seems this specific item slipped
> out of my mind :-(

Bad luck, you managed to miss one of the most important commands, but it's solved now.

> About dump/restore operation, restoring the database cluster is running
> for 24hrs now (psql -f pgdumpallOutputfile postgres). Since it took 13hrs
> to dump the cluster, I begin to wonder how long it is going to take to
> restore it...

Much longer than that, especially if, as you say below, you have a lot of indexes. This is one of the reasons many of us do not use pg_dumpall for anything but global objects, and instead use something like the crontab script John R. Pierce posted. (I use a similar thing, but with an intermediate script which dumps critical databases more frequently, skips "recreatable" databases (maybe someone can confirm that word is right? I mean test things which can be created from scratch, as they come from a script), and keeps several numbered copies.) Doing it this way ensures we can restore in order of criticality if it needs to be done. (I even move unmodified partitions to a 'historic' schema, which gets dumped only after a change; this cuts my backup times to a tenth.)

One thing: I do not know how you are restoring the database, but when doing these things we use a specially tuned postgresql.conf (fsync off, minimal logging, lots of work_mem, and similar things), as we do not care about durability (you can just rerun initdb and redo the restore, and there is only one session connected, the restoring one). This easily cuts the restore time to a tenth; after it ends we restart the server with the normal config. It is a must when doing this sort of thing.
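For reference, a minimal sketch of the kind of restore-only postgresql.conf I mean. The exact values are assumptions (size the memory settings to your machine), and remember that with fsync off a crash mid-restore means rerunning initdb and the whole restore:

```
# Restore-only settings -- NOT for production use.
fsync = off                   # do not flush WAL to disk
full_page_writes = off        # no torn-page protection needed here
synchronous_commit = off      # do not wait for WAL on commit
maintenance_work_mem = 2GB    # speeds up CREATE INDEX; size to your RAM
checkpoint_timeout = 30min    # fewer checkpoints during the bulk load
```

After the restore finishes, stop the server and put the normal postgresql.conf back before letting clients in.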
> My main concern is about how the indexes are managed in dump/restore
> operations. I understand that pg_dumpall actually uses pg_dump, where the
> doc says "Post-data items include definitions of indexes, triggers..."
> I would not worry if the doc said that indexes are simply copied, but it
> says "includes definition of indexes".
> Since some of the indexes took days to build... could someone confirm
> indexes are rebuilt instead of copied?
> If indexes are actually rebuilt, why should it be done that way? - There
> must be a good reason!

You are out of luck, and there is a reason. First, pg_dump does not copy, it dumps. Its simpler behaviour (text output) just writes a SQL script which recreates everything and inserts all the data (normally using COPY for speed, but it is the same as inserting). It takes care of generating a fast script, meaning it creates the tables, then inserts the data, then creates the indexes and reactivates the constraints, which is faster than defining everything up front and inserting with indexes and constraints active. The reason to do it in text mode is that you can dump between different versions and/or architectures, and also that the dump is much smaller than the db, especially if you compress it (I always do; with a bit of testing you can always find a compressor which leads to faster backups, as the saved disk writing easily offsets the compression time, especially on modern multi-CPU, memory-rich machines). Bear in mind that in many scenarios you back up a lot (we dump some critical things hourly, even though we use replication) and restore nearly never, and would rather spend a couple of extra days on a restore than a couple of extra hours of degraded performance on every backup.
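As an illustration of the compressed text-dump route described above (the database and file names are made up, and gzip is just one choice of compressor):

```shell
# Dump to a compressed SQL script; on a multi-CPU, memory-rich box the
# saved disk writing usually more than offsets the compression time.
pg_dump mydb | gzip > mydb.sql.gz

# Restore later by streaming the script back into psql:
gunzip -c mydb.sql.gz | psql mydb
```

Any stream compressor (pigz, lz4, zstd...) can be swapped into the pipe; benchmark a couple on your own data.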
This being said: if you have an 820G db (I still do not know what size that is; I suppose it's the $PGDATA footprint) of important data (it does not seem critical in availability, as you are taking days and are still in business), and you are having these kinds of problems dumping, restoring and moving directories around in your OS, and you do not know how much time your backups take, then you have a problem. You should practice backup AND restore more, because your question indicates you MAY be backing up your data, but you have never restored a backup.

Also, the text output format is really good for the global objects in pg_dumpall, but not so much for the normal databases. For those you should use the custom format, unless it is a really small db. Its drawbacks are that it can only do one database per file and that it needs pg_restore to be read (I know those are minor). The advantage is that instead of generating a plain-text dump it builds a kind of tar file with the definitions and data for every object clearly separated, so you can do partial restores or whatever you want (in fact, without options and without connecting to a database, pg_restore spits out the same text file that a text dump would generate). If you had used this technique you could have restored your tables in order, or restored only the data and then rebuilt the indexes concurrently with some other (performance-degraded) work. You can do the same thing by editing the text dump, but that gets impractical and really complex beyond a few megabytes.

Regards,
Francisco Olarte.

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
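To make the custom-format workflow above concrete, a sketch (the database and file names are hypothetical, and --section needs pg_restore 9.2 or later):

```shell
# Dump one database in custom format (compressed by default):
pg_dump -Fc -f mydb.dump mydb

# Inspect the archive's table of contents; no server connection needed:
pg_restore -l mydb.dump

# With no -d option, pg_restore emits the same SQL a plain dump would:
pg_restore mydb.dump > mydb.sql

# Partial restore: load schema and data first, then build indexes and
# constraints later, when you can afford the degraded performance:
pg_restore -d mydb --section=pre-data --section=data mydb.dump
pg_restore -d mydb --section=post-data mydb.dump
```

The -l listing can also be edited and fed back with -L to restore an arbitrary subset of objects in an arbitrary order.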