I have a question about using pg_upgrade on a very large database. It is not feasible to copy the data location, so I am looking at the link option (pg_upgrade -k). That is, for the same reason it is not practical to run pg_dumpall and restore, it is not practical to copy several tens of terabytes of data files during each migration.

My two very small tests have failed, no doubt due to my own lack of understanding. I did not have 8.x software on my test computer, so I was testing pg_upgrade using 9.0.2 and 9.0.3.

Goal: in-place upgrade of a cluster named test.

1. Built a clean system with CentOS 5.4 x86_64 and 8 GB RAM (it is running in a VirtualBox VM under Windows 7).
2. Installed 9.0.2 (PGHOME=/pghome/9.0.2).
3. Created a cluster with a few million rows of sample data (PGDATA=/pgdata1/test). Just one regular table with one column of type numeric, populated with numbers from 1 to 1,000,000. No indexes or other objects.
4. Stopped the server with pg_ctl stop.
5. Installed 9.0.3 (PGHOME=/pghome/9.0.3), including the contrib modules for pg_upgrade and pg_upgrade_support.
6. Ran su - postgres and tried to run initdb per the documentation, but it failed because the data dir is already populated. Should the documentation be modified to note that initdb is not to be run when using pg_upgrade in link mode?
7. Ran "pg_upgrade -k" with additional flags for the old and new bin dirs, etc. as shown below.

Through trial and error I found I must unset certain env. variables. I am setting a number of other variables simply to shorten what must be typed on the command line for the pg_upgrade parameters.
It errors out saying the cluster already has files, which of course is expected since I am using the -k flag.

Aside from the fact that I do not need to use pg_upgrade to do a minor release and this is purely an example, what am I doing wrong? Am I misunderstanding the meaning of the link option? I assume for upgrade in place the old and new cluster are the same thing, just the binaries are different. I also assume env. variables are allowed and you are not opening other windows/sessions.

-Mark
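
On the meaning of the link option: as far as the pg_upgrade documentation describes it, link mode hard-links the old cluster's data files into the new cluster's data directory rather than copying them. If that reading is right, the old and new data directories would be two distinct directories on the same filesystem, not one shared directory. The sketch below only illustrates the hard-link mechanism with plain ln; it does not run pg_upgrade, and the scratch paths and the file name 16384 are hypothetical stand-ins.

```shell
#!/bin/sh
# Hypothetical scratch area; nothing here touches a real cluster.
work=$(mktemp -d)
mkdir "$work/old_data" "$work/new_data"

# Stand-in for one relation file in the old data directory.
printf 'sample relation data\n' > "$work/old_data/16384"

# Create a hard link instead of a copy: no data blocks are duplicated.
# This only works when both directories are on the same filesystem.
ln "$work/old_data/16384" "$work/new_data/16384"

# Both names should now refer to the same inode.
old_inode=$(ls -i "$work/old_data/16384" | awk '{print $1}')
new_inode=$(ls -i "$work/new_data/16384" | awk '{print $1}')
echo "old inode: $old_inode, new inode: $new_inode"

# Remove the scratch area.
rm -rf "$work"
```

Because both names point at the same inode, the link costs almost no extra disk space regardless of file size, which is the property that matters for a multi-terabyte cluster.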