Curt Moore wrote:
I also had Slony in this mix, which resulted in a few more steps. From
what I've read on the wiki pages regarding the current Fedora
infrastructure setup, there is only 1 DB server, a SPOF. It may be
worth the time to start a separate conversation about some sort of
replication or clustered setup, which is a beast unto itself.
Correct, there is one DB server at this time (making upgrades even more
fun!). I think a conversation about a replicated or clustered setup
would be good, at least to get some ideas out on the table.
Now, on to the upgrade process.
*** Make a DB and filesystem backup ***
Ensure that all necessary config files are archived so they can be
quickly reinstalled via kickstart later.
1) Set up all apps to connect to DB using a host alias in /etc/hosts
2) Set up another server as a temporary DB server
3) Temporarily disable apps hitting DB
4) pg_dump DBs and template0, users, groups, etc. over to temp server
5) Test to ensure DB is up and accepting connections
6) Change the alias entry in /etc/hosts on hosts running apps to the
record for the temp DB server
7) Re-enable apps
8) Re-install and configure OS on old DB server via kickstart
9) Temporarily disable apps hitting DB
10) pg_dump DBs and template0, users, groups, etc. over to master server
11) Test to ensure master DB is up and accepting connections
12) Change the alias entry in /etc/hosts on servers running apps back to
the record for the master DB server
There, my 12-step program for a DB upgrade. :-)
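For what it's worth, the alias flip in steps 6 and 12 could be scripted rather than edited by hand. Below is a minimal Python sketch, not anything from the actual setup: the alias name `dbhost`, the addresses, and the function name `repoint_alias` are all made up for illustration, and it assumes the alias sits on its own line in /etc/hosts.

```python
def repoint_alias(hosts_text: str, alias: str, new_ip: str) -> str:
    """Return hosts_text with the line that carries `alias` pointed at new_ip.

    Assumption: the alias has its own /etc/hosts line, because every name
    on the matching line moves to the new address.
    """
    out = []
    for line in hosts_text.splitlines():
        # Split off any trailing comment so it is preserved as written.
        body, sep, comment = line.partition("#")
        fields = body.split()
        # fields[0] is the address; the remaining fields are host names.
        if len(fields) >= 2 and alias in fields[1:]:
            new_line = " ".join([new_ip] + fields[1:])
            if sep:
                new_line += " #" + comment
            out.append(new_line)
        else:
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical contents; a real run would read and write /etc/hosts.
    sample = "127.0.0.1 localhost\n10.0.0.5 dbhost # master\n"
    print(repoint_alias(sample, "dbhost", "10.0.0.9"), end="")
```

Keeping this as a pure text-in, text-out function means it can be tried against a copy of the hosts file before touching the real one on each app server.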
This looks like a good start to the plan to me. I had mentioned
possibly using one of the app servers as a temporary DB server during
the upgrade of the DB server. People were receptive to the idea when I
mentioned it a while back. Some of the upgrades have been fun, so not
having to worry as much about a very tight time limit should the OS
upgrade go poorly would make things easier.
During the time I disabled the apps to copy over the DB to the temp
server, users just couldn't log in to the web portal to change their
settings, which are written to the master, for a few minutes. I had to
ask, would this really be an issue for 5 min at 3am CST on a Sunday
morning? (Assuming that the DBs can be copied in the span of 5 min; I
have a VLAN'd GigE management network for this sort of thing.) The same
downtime was experienced when I switched back over to the master DB.
The downtime for the swap to a temp server and then back to the real
server shouldn't be a major issue. Obviously the less downtime the
better, but the little bits of downtime doing those flip-flops would be
less than just taking the DB box down completely during the upgrade.
Elliot, feel free to correct me here! ;)
Since I'm assuming we're upgrading from Postgres 7.x we'll definitely
want to do a pg_dump/pg_restore since there are some inherent
differences in the data structures on the disk.
Elliot had expressed interest in going to an 8.x version.
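To make the dump/restore idea concrete, here is a rough Python sketch that only builds the command lines and prints them; it runs nothing. The host names, database name, and file paths are placeholders, the function name is made up, and the flags are the ones in current PostgreSQL client tools, so they should be checked against whatever versions actually get installed. It also assumes the target server has a `postgres` maintenance database for loading the globals.

```python
def dump_and_restore_commands(src_host, dst_host, dbname, dump_file):
    """Build (not run) the commands for moving one database between servers."""
    globals_file = dump_file + ".globals.sql"
    return [
        # Roles (users and groups) live outside any single database.
        ["pg_dumpall", "-h", src_host, "--globals-only", "-f", globals_file],
        # Custom-format archive of one database, readable by pg_restore.
        ["pg_dump", "-h", src_host, "-Fc", "-f", dump_file, dbname],
        # Load the roles first so object ownership can be restored.
        ["psql", "-h", dst_host, "-f", globals_file, "postgres"],
        # Create the empty target database, then restore into it.
        ["createdb", "-h", dst_host, dbname],
        ["pg_restore", "-h", dst_host, "-d", dbname, dump_file],
    ]


if __name__ == "__main__":
    # All names below are hypothetical.
    for cmd in dump_and_restore_commands(
        "db-old", "db-temp", "exampledb", "/tmp/exampledb.dump"
    ):
        print(" ".join(cmd))
```

Printing the commands first gives a reviewable list to paste into the upgrade notes before anyone runs them at 3am.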
Thanks!
Jeffrey