On Wed, Jan 12, 2011 at 08:06, seth vidal <skvidal@xxxxxxxxxxxxxxxxx> wrote:
> Hi Everyone,
> I took some notes while we were rebooting boxes and wanted to share them
> with everyone for future outages.
>
> Ordering of the bounces:
> 1. xen14: puppet is on there and if that is back up first we have a
> place to stand for pushing out any changes (dns changes for example via
> puppet) - xen14 takes about 4 minutes to restart/POST

Most of the new IBM hardware can take 4-6 minutes to reboot. I don't
know if there are some flags I should have set on it, but it is deadly
slow.

> Overall things to think about for the future:
> 1. dumping a complete virsh list - including how much memory is actually
> being used per vm per server before we start reboots
> 2. checking what disks need fscks because of mounted time and doing
> those earlier or separately.
> 3. verifying that all running vms are:
>    a. intended to be running
>    b. have a config file
>    c. are set to autostart
> 4. verifying that all NOT running vms are:
>    a. intended to be off
>    b. are NOT set to autostart

Looks good. I thought koji2 was running before the reboots, but it may
have been a ghost vm.

> thoughts welcome.
> -sv
>
> _______________________________________________
> infrastructure mailing list
> infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
> https://admin.fedoraproject.org/mailman/listinfo/infrastructure

-- 
Stephen J Smoogen.
"The core skill of innovators is error recovery, not failure avoidance."
Randy Nelson, President of Pixar University.
"Let us be kind, one to another, for most of us are fighting a hard battle."
-- Ian MacLaren
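
For what it's worth, checklist items 3 and 4 above could be scripted along these lines. This is only a rough sketch, not something we actually run: the Vm record, the audit_vms function, and the "intended to be running" set are all made up for illustration, and filling in the running/config/autostart fields (for example from "virsh list --all" and the Autostart line of "virsh dominfo") is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vm:
    """Hypothetical record for one guest; fields are gathered elsewhere."""
    name: str
    running: bool
    has_config: bool
    autostart: bool


def audit_vms(vms, intended_running):
    """Return (vm name, problem) tuples for the pre-reboot checklist.

    intended_running is an assumed, hand-maintained set of guest names
    that are supposed to be up on this host.
    """
    problems = []
    for vm in vms:
        should_run = vm.name in intended_running
        if vm.running:
            # Item 3: every running vm is intended, has a config, autostarts.
            if not should_run:
                problems.append((vm.name, "running but not intended to be running"))
            if not vm.has_config:
                problems.append((vm.name, "running without a config file"))
            if not vm.autostart:
                problems.append((vm.name, "running but not set to autostart"))
        else:
            # Item 4: every stopped vm is intended to be off and will stay off.
            if should_run:
                problems.append((vm.name, "not running but intended to be running"))
            if vm.autostart:
                problems.append((vm.name, "not running but set to autostart"))
    return problems
```

Calling audit_vms with the guests found on a host and the set of names expected there returns an empty list when everything matches. A running guest with no config file would show up as the kind of "ghost vm" mentioned above.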