A few times in the last week we have hit a state where kojira on koji02 times out and doesn't run newrepo tasks for buildroots. Restarting httpd on koji01 seems to unstick it, but this is not even a good stopgap: we would have to do that manually each time, and people would see small windows when koji was down.

After some investigation by the koji developers, the solution, at least for now, is to increase the SSL timeout. It was 60s; I have increased it to 180s in a hotfix:

http://infrastructure.fedoraproject.org/cgit/ansible.git/commit/?id=ebb160dceef8ede2ce97b3b940987b798380dfea
and
http://infrastructure.fedoraproject.org/cgit/ansible.git/commit/?id=f99e19b0244cd7ff8be570a6df6b78f494eceb57

This is already applied on koji02 to get us out of an outage situation (no new buildroots means no new Fedora). With +1s, I will apply it to koji01 as well and make sure the playbooks are in sync with the hosts.

kevin
_______________________________________________
infrastructure mailing list
infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/infrastructure