On 2/17/25 5:12 AM, Clement Verna wrote:
> On Sun, 16 Feb 2025 at 13:52, Zbigniew Jędrzejewski-Szmek
> <zbyszek@xxxxxxxxx> wrote:
>> On Sat, Feb 15, 2025 at 11:11:49AM -0500, Dusty Mabe wrote:
>>> On 2/15/25 9:54 AM, Zbigniew Jędrzejewski-Szmek wrote:
>>>> On Fri, Feb 14, 2025 at 02:40:29PM -0800, Adam Williamson wrote:
>>>>> On Fri, 2025-02-14 at 16:31 -0500, Dusty Mabe wrote:
>>>>>> IMO the bar would only need to be that high if the user had no way
>>>>>> to ignore the test results. All gating does here (IIUC) is require
>>>>>> them to do an extra step before it automatically flows into the
>>>>>> next rawhide compose.
>>>>>
>>>>> again, technically, yes, but *please* let's not train people to have
>>>>> a pavlovian reaction to waive failures, that is not the way.
>>>>
>>>> IMO, the bar for *gating* tests needs to be high. I think 95% true
>>>> positives would be a reasonable threshold.
>>>
>>> I can't promise a 95% true positive rate. These aren't unit tests.
>>> They are system-wide tests that try to test real-world scenarios as
>>> much as possible. That does mean pulling things from
>>> github/quay/s3/Fedora infra/etc., and thus flakes happen. Now, in our
>>> tests we do collect failures and retry them. If a retry succeeds we
>>> take it as success and never report the failure at all. However,
>>> there are parts of our pipeline that might not be so good at retrying.
>>>
>>> All I'm trying to say is that when you don't control everything, it's
>>> hard to say with confidence that something will be 95%.
>>
>> As AdamW wrote in the other part of the thread, OpenQA maintains a
>> false positive rate close to 0%. So it seems possible, even with our
>> somewhat unreliable infrastructure…
>>
>> I am worried about the high failure rate for the coreos tests. But it
>> is possible that if we make them gating, the reliability will improve.
>> I know that in the case of systemd, there was a failure that affected
>> quite a few of the updates because it wasn't fixed immediately. If we
>> had blocked the first update, the percentage of failures would have
>> been lower. So I think it makes sense to try this… If after a few
>> months with this we still have too many updates blocked by gating, we
>> can reevaluate.
>>
>>> I would be happy to provide a monthly report of failures so that we
>>> can measure the rate of false positives.
>>>
>>> As I promised before, maybe just work with us on it. These tests have
>>> been enabled for a while and I've only seen a handful of package
>>> maintainers look at the failures (you, Zbyszek, being one of them;
>>> thank you!).
>>>
>>> We do want them to be useful tests, and I promise that when a failure
>>> happens because of our infra or the tests themselves being flaky, we
>>> try to get it fixed.
>>
>> One more question: are packagers able to restart the tests?
>
> @Dusty Mabe <dusty@xxxxxxxxxxxxx> will know better, but I don't think
> packagers can restart the tests currently.

No. Not currently. But it is something we could look into enabling
and/or making easier.

Dusty
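The retry behavior Dusty describes (collect failures, retry them, and count a
passing retry as a pass) can be sketched roughly like this. It is a minimal
illustration only; run_test, the "run-one-test" command, and the attempt count
are assumptions for the sketch, not the actual coreos pipeline code:

```python
import subprocess

def run_test(name: str) -> bool:
    """Hypothetical stand-in for launching one test case;
    "run-one-test" is an invented command, not a real CLI."""
    return subprocess.run(["run-one-test", name]).returncode == 0

def run_with_retry(name: str, attempts: int = 2) -> bool:
    """Count the test as passing if any attempt passes, so one-off
    flakes (network hiccups, quay/s3/GitHub outages) are absorbed
    and never show up in the reported results."""
    for _ in range(attempts):
        if run_test(name):
            return True
    return False
```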
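The 95%-true-positive threshold discussed in the thread is simple arithmetic
over the kind of monthly failure report Dusty offers: the fraction of reported
failures that were real regressions rather than infra or test flakes. A
hypothetical sketch, assuming an invented record format with a "cause" field:

```python
def true_positive_rate(failures: list[dict]) -> float:
    """Fraction of reported failures that were real regressions.
    Expects records like {"cause": "regression" | "infra" | "flake"};
    the record format is invented for this illustration."""
    if not failures:
        return 1.0  # no failures reported, nothing to discount
    real = sum(1 for f in failures if f["cause"] == "regression")
    return real / len(failures)

# e.g. 19 real regressions out of 20 reported failures -> 0.95,
# right at the threshold proposed in the thread
print(true_positive_rate(
    [{"cause": "regression"}] * 19 + [{"cause": "infra"}]
))
```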