Day one of the datacenter service migration

Greetings everyone. 

I thought I'd share with everyone how things went today and where we are
at on the datacenter service migration. :) 

We did get everything we planned to migrate today migrated: 

* staging services shut down, and the machines with those resources
 readied to be shipped out. 

* fedora-messaging/fedmsg buses/clusters. We ran into a brief hiccup
here as I hadn't properly issued the rabbitmq cluster certs with an
alternative name for 'rabbitmq.fedoraproject.org', but we found it and
fixed it. All consumers and producers should be connecting to the new
rabbitmq cluster in the new datacenter now. 

* notifications service. Since this service is brittle and ancient we
elected to just copy the vm over and adjust its settings for the new dc
after. I am not 100% sure it's functioning properly yet (as it takes a
while to start up), but it seems to be close. 

* pdc. This one took much longer than I expected, and I am sorry about
that as it prevented people from committing. It turns out the pdc
database is pretty gigantic, so it took a few hours to dump it out and
load it in the new db server. ;( It's up and working now. 

* mirrormanager. There's some stats reporting and a few crons that may
need tweaking, but I think the basic service is working. 

* Authentication stack (ipa, fas, ipsilon). I ran into a few snags here
with routing and vpns, but everything should be moved over and working
normally.
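
For anyone wanting to check for the kind of cert problem described in
the rabbitmq item above, here is a minimal sketch. It assumes the
certificate has already been parsed into the dict shape that Python's
ssl.SSLSocket.getpeercert() returns. The function names and the
'rabbitmq01.example.internal' hostname are hypothetical and only for
illustration; this is not the check that was actually run during the
migration.

```python
def hostname_matches(pattern: str, hostname: str) -> bool:
    """Compare a hostname with one DNS subjectAltName entry.

    A leading '*.' in the pattern stands for exactly one leftmost label.
    """
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if pattern == hostname:
        return True
    if pattern.startswith("*."):
        # Drop the first label of the hostname and compare the remainder.
        _, _, rest = hostname.partition(".")
        return rest != "" and rest == pattern[2:]
    return False


def cert_covers(cert: dict, hostname: str) -> bool:
    """Return True if any DNS subjectAltName entry covers the hostname.

    `cert` is assumed to look like the result of getpeercert(), e.g.
    {"subjectAltName": (("DNS", "host.example.org"),)}.
    """
    sans = cert.get("subjectAltName", ())
    return any(
        kind == "DNS" and hostname_matches(value, hostname)
        for kind, value in sans
    )


# Hypothetical cert issued only for the node's own name.
node_only = {"subjectAltName": (("DNS", "rabbitmq01.example.internal"),)}

# The same cert reissued with the service name as an alternative name.
with_alias = {
    "subjectAltName": (
        ("DNS", "rabbitmq01.example.internal"),
        ("DNS", "rabbitmq.fedoraproject.org"),
    )
}

print(cert_covers(node_only, "rabbitmq.fedoraproject.org"))
print(cert_covers(with_alias, "rabbitmq.fedoraproject.org"))
```

A client connecting by the service name would reject the first cert and
accept the second, which matches the symptom described above.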

On the unplanned side, it turned out to be more complex than I had
thought to move just some of our openshift apps and not the others.
Because of that I made the decision to just move all the (user facing)
ones today. That includes: fas, ipsilon, bodhi, compose-tracker,
elections, greenwave, waiverdb, mdapi and a few more non-user-facing
ones. 

The openshift apps move caused an outage for the elections app (both a
short one while it was entirely down, and another short time when
authentication wasn't yet working). Additionally, when bodhi was moved
it was inadvertently restarted before its database was synced over, so
if you saw a bunch of bodhi actions today where it said it was pushing
things to stable that were already stable, that was the cause. ;( I
quickly stopped the app and resynced the db, and hopefully not much
damage was done. 

Tomorrow is the big day for koji and all its associated services. 
This work will start at 15:00 UTC, so if you have any builds to do, make
sure they finish before then. Any in-progress builds will be canceled at
around 15:00 UTC; you can resubmit them once things are back up. 

Thanks again for everyone's patience with this move and hopefully we will
survive the week. :) 

kevin


_______________________________________________
infrastructure mailing list -- infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to infrastructure-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
