Greetings all.

This email will cover days 3 and 4, as by the time I was going to send yesterday's it was late and mailman was still down anyhow. :)

So, yesterday started out seeming like a pretty simple day, but didn't turn out that way. We planned to move only two things and to work on fixing issues from the buildsystem and other moves in the first two days.

* datagrepper / datanommer. This took until this morning as the database is really gigantic. Again, we wanted to load it into a more modern postgres. Now that it's moved and on postgres 12.2, we will be looking into partitioning the data (perhaps by month? quarter?) so queries for anything recent are much faster.

* mailman / lists: This turned out to be our biggest problem of the move. :( We are working on getting this install moved over to a recent fedora or rhel, but for now it's rhel7 and python34. Because of that we decided to just copy the entire instance over and adjust it, rather than do a fresh install. The copy ran most of the day and was nearing completion, but then we accidentally resized the original instance. :( We resized it back, but the filesystem was messed up and the instance would no longer boot. It was at this point we decided that lack of sleep could lead to poor decisions and mistakes, so we started copying the data from the copy to another freshly installed instance and went and got some sleep. The next day, in a stroke of luck, it turned out the copy we had been doing had already copied all of the disk that had data on it, so we were able to fsck it and resize it and we were back in business. mailman/lists was back up this morning and happily processing away.

Today, in addition to finishing the above two migrations from yesterday, we moved:

* openqa. Right now it doesn't have any arm or power workers, but we have some almost ready to go that we should have in place next week.

* Various openshift apps (docs building, websites building, cron jobs, etc). We even have release-monitoring and the new hotness up and running. I am trying to bring koschei up as well, but it needs some more work.

* Some small misc apps: blockerbugs, kerneltest, etc.

* We also fixed tons and tons of issues all over the map, mostly around things not being able to reach other things, or something not running for some configuration reason.

At this point everything we planned to be in the minimal fedora should be up and working. We do have more capacity than we need, so if things go smoothly without too many more things to fix, I'd like to see about bringing up badges, as it's a popular app. If we have the capacity and can easily do it, we will bring it up.

Tomorrow and this weekend we are going to work on taking things down in the old datacenter and getting them ready for shipping next week. They will be in transit next week, then hopefully we can get them racked and built and start adding capacity back the week after.

So, if you notice something not working now, please do look to see if there's already a ticket on it, and if not please file one. ( https://pagure.io/fedora-infrastructure/issues ).

Overall things went pretty well from my view, and I would really like to thank the awesome fedora community for being patient with us. I was pretty surprised how few people asked why things were down, and when they did, other community members were quick to tell them.

kevin
_______________________________________________
infrastructure mailing list -- infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to infrastructure-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/infrastructure@xxxxxxxxxxxxxxxxxxxxxxx