0. Plan time in infrastructure@xxxxxxxxxxxxxxxxxxxxxxxx

1. Open ticket on infrastructure for downtime.
   Updates will occur during day
   Reboots will occur during evening

2. Send email to devel-announce, announce, infrastructure

3. Update servers during working hours and work out issues in ticket.
   ** releng updates the following boxes:
      cvs01, pkgs01, nfs01, bnfs01, bxen*, x86-*, ppc*, koji*, db03,
      xb-01, compose-*, sign-vault01

4. Change DNS to turn off proxy on bodhost01 (or similar external
   proxy server).

5. Reboot bodhost01

6. Confirm proxy is working on bodhost/fix issues.

7. Change proxy dns to only go to bodhost01

8. Turn off nagios for servers.

9. Turn off nagios-external for services.

10. Reboot order counts

11. releng deals with the boxes listed above unless told otherwise.

12. reboots with database servers first
    xen15: db02
    xen12: db01

13. reboot PHX2 boxes
    xen03:
    xen04:
    xen06:
    xen07:
    xen09:
    xen10:
    xen11:
    xen13:
    backup01:

14. reboot Outside boxes (can be in parallel to PHX2)
    cnode01:
    cnode02:
    cnode03:
    ibiblio01:
    internetx01:
    osuosl01:
    people01:
    serverbeach1:
    serverbeach2:
    serverbeach3:
    serverbeach4:
    serverbeach5:
    telia1:
    tummy1:

15. reboot bastion.fedoraproject.org
    log into bastion1 from outside system
    log into bastion2 from outside world
    log into xen05 from bastion01
    bastion01:
      sudo su /usr/sbin/puppetd --disable
      sudo su /sbin/service openvpn start
    bastion02
      sudo su /sbin/service openvpn start
    xen05
      sudo /sbin/shutdown -r now
    once xen05/bastion2 server is back up, we can
    bastion01:
      sudo su /sbin/service openvpn stop
      sudo su /usr/sbin/puppetd --enable

16. reboot puppet01
    log into bastion2 from outside world
    ssh xen14
    sudo /sbin/shutdown -r now

17. re-enable DNS for proxy servers
    test proxy servers from puppet01
    edit dns in git puppet make ns1

18. re-enable nagios on internal/external

19. Setup transifex agent on app servers:
    app01 app02 app03 app04 app07
    sudo -u transifex /var/lib/transifex/ssh-add.sh -f

20. Log and report problems to list.

21. Close ticket.
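
Since the reboot order counts (steps 10 to 16), the sequence can be written down as data and printed as a dry-run plan before the evening starts. The following is only a rough sketch, not an existing tool: the names REBOOT_PHASES, REBOOT_CMD and reboot_plan are made up for illustration, and applying "sudo /sbin/shutdown -r now" to every host is an assumption (the checklist only spells that command out for xen05 and xen14). It prints commands and does not run anything, and it does not model the openvpn/puppetd preparation that step 15 needs on the bastion hosts.

```python
# Dry-run sketch of the reboot order from steps 12-16 of the checklist.
# All names below (REBOOT_PHASES, REBOOT_CMD, reboot_plan) are hypothetical.

# Phases in the order the checklist gives them; hosts in listed order.
REBOOT_PHASES = [
    ("database hosts", ["xen15", "xen12"]),
    ("PHX2 boxes", ["xen03", "xen04", "xen06", "xen07", "xen09",
                    "xen10", "xen11", "xen13", "backup01"]),
    ("outside boxes", ["cnode01", "cnode02", "cnode03", "ibiblio01",
                       "internetx01", "osuosl01", "people01",
                       "serverbeach1", "serverbeach2", "serverbeach3",
                       "serverbeach4", "serverbeach5", "telia1", "tummy1"]),
    ("bastion", ["xen05"]),
    ("puppet01", ["xen14"]),
]

# Assumption: the same reboot command is used on every host.
REBOOT_CMD = "sudo /sbin/shutdown -r now"


def reboot_plan(phases=REBOOT_PHASES):
    """Return (phase, host, command) tuples in checklist order."""
    plan = []
    for phase, hosts in phases:
        for host in hosts:
            plan.append((phase, host, f"ssh {host} {REBOOT_CMD}"))
    return plan


if __name__ == "__main__":
    # Print only; nothing is executed against any server.
    for phase, host, cmd in reboot_plan():
        print(f"[{phase}] {cmd}")
```

The sketch lists the outside boxes after the PHX2 boxes for simplicity; per step 14 those two phases can run in parallel.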
-- Stephen J Smoogen.
“The core skill of innovators is error recovery, not failure avoidance.”
Randy Nelson, President of Pixar University.
"We have a strategic plan. It's called doing things." - Herb Kelleher, founder Southwest Airlines