Fedora CI for large RPMs?

Hi!

The llvm rpm (https://src.fedoraproject.org/rpms/llvm) has recently been struggling with Fedora CI, by which I mean the CI that produces scratch builds and runs dist-git tests on merge requests.

The llvm rpm combines long build times (typically 3-8 hours on koji, depending on arch) with many merge requests, and that is where things break down. We've received multiple complaints that scratch builds for our MRs occasionally end up clogging s390x koji. Several problems with Fedora CI contribute to this:

* Both Zuul and Fedora CI produce their own independent scratch builds, doubling the load. I think this is tracked as part of https://pagure.io/fedora-ci/general/issue/476. This is the only problem we were able to address ourselves, by disabling Zuul.

* Fedora CI does not cancel old scratch builds when a new commit is pushed or the MR is rebased (https://pagure.io/fedora-ci/general/issue/493). This means that if some changes are pushed in response to MR feedback, you end up with an extra set of scratch builds running in parallel. This is further exacerbated by Pagure not having proper support for rebase merges, so if you hit Rebase and then Merge you also get a bonus scratch build. (I'm not sure whether Zuul properly cancels scratch builds, or whether it produces zombies as well.)

* Fedora CI is not configurable. For example, we can't disable just the s390x scratch builds (https://pagure.io/fedora-ci/general/issue/494), which tend to be more than twice as slow as builds on other architectures.

* As far as I know, it's not even possible to disable Fedora CI entirely, e.g. to use only Zuul instead. Similarly, we can't turn off automatic triggering of Fedora CI and require a manual [citest] instead.

* For scratch builds longer than 4 hours, Fedora CI never reports back the result (https://pagure.io/fedora-ci/general/issue/485), even though the scratch build continues running. The CI status stays pending forever. For llvm, all scratch builds take more than 4 hours, so we never get results. This also means that dist-git tests never run. I submitted a PR to raise this timeout (https://github.com/fedora-ci/dist-git-build-pipeline/pull/41) but have not received a response.

It's not really necessary to solve *all* of these problems -- I think the MVP to make MRs usable for llvm without negatively affecting other people would probably be to increase the timeout and either a) implement auto-cancellation for scratch builds or b) allow preventing auto-start of CI, requiring manual [citest]. (Naively, I assume the latter is easier to implement.)
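
To illustrate what I mean by option a), the selection logic could look roughly like the sketch below. This is not how Fedora CI or koji work internally: the `ScratchBuild` record, its fields, and the `superseded_builds` helper are all hypothetical, and the task ids and commits are made up. The real pipeline would still need to look up the running tasks and cancel them through koji.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScratchBuild:
    """Hypothetical record of one scratch build started for a merge request."""
    task_id: int     # build task id (made-up values below)
    mr_id: int       # merge request the build belongs to
    commit: str      # commit the build was started for
    finished: bool   # True once the build has completed or failed


def superseded_builds(builds, mr_id, head_commit):
    """Return still-running builds of an MR that were started for an older commit."""
    return [
        b for b in builds
        if b.mr_id == mr_id and not b.finished and b.commit != head_commit
    ]


builds = [
    ScratchBuild(task_id=1, mr_id=10, commit="aaa", finished=False),
    ScratchBuild(task_id=2, mr_id=10, commit="bbb", finished=False),
    ScratchBuild(task_id=3, mr_id=11, commit="ccc", finished=False),
    ScratchBuild(task_id=4, mr_id=10, commit="aaa", finished=True),
]

# MR 10 was updated (pushed or rebased) to commit "bbb": only builds of
# MR 10 that are still running for a different commit should be cancelled.
to_cancel = [b.task_id for b in superseded_builds(builds, mr_id=10, head_commit="bbb")]
print(to_cancel)
```

Since a rebase changes the commit hash, the same check would also cover the "Rebase, then Merge" case mentioned above.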

However, I haven't been able to get any response from maintainers on Fedora CI issues or PRs, so I'm not really sure what to do here anymore, thus this mail to fedora-devel. I'd appreciate any pointers on how to move forward.

Regards,
Nikita