On 2/17/25 5:12 AM, Clement Verna wrote:
> On Sun, 16 Feb 2025 at 13:52, Zbigniew Jędrzejewski-Szmek
> <zbyszek@xxxxxxxxx> wrote:
>> On Sat, Feb 15, 2025 at 11:11:49AM -0500, Dusty Mabe wrote:
>>> On 2/15/25 9:54 AM, Zbigniew Jędrzejewski-Szmek wrote:
>>>> On Fri, Feb 14, 2025 at 02:40:29PM -0800, Adam Williamson wrote:
>>>>> On Fri, 2025-02-14 at 16:31 -0500, Dusty Mabe wrote:
>>>>>> IMO the bar would only need to be that high if the user had no way
>>>>>> to ignore the test results. All gating does here (IIUC) is require
>>>>>> them to do an extra step before it automatically flows into the
>>>>>> next rawhide compose.
>>>>>
>>>>> again, technically, yes, but *please* let's not train people to have
>>>>> a pavlovian reaction to waive failures, that is not the way.
>>>>
>>>> IMO, the bar for *gating* tests needs to be high. I think 95% true
>>>> positives would be a reasonable threshold.
>>>
>>> I can't promise a 95% true positive rate. These aren't unit tests.
>>> They are system-wide tests that try to test real-world scenarios as
>>> much as possible. That does mean pulling things from
>>> github/quay/s3/Fedora infra/etc., and thus flakes happen. Now, in our
>>> tests we do collect failures and retry them. If a retry succeeds we
>>> take it as success and never report the failure at all. However,
>>> there are parts of our pipeline that might not be so good at retrying.
>>>
>>> All I'm trying to say is that when you don't control everything, it's
>>> hard to say with confidence that something will be 95%.
>>
>> As AdamW wrote in the other part of the thread, OpenQA maintains a
>> false positive rate close to 0%. So it seems possible, even with our
>> somewhat unreliable infrastructure…
>>
>> I am worried about the high failure rate for the coreos tests. But it
>> is possible that if we make them gating, the reliability will improve.
>> I know that in the case of systemd, there was a failure that affected
>> quite a few of the updates because it wasn't fixed immediately. If we
>> had blocked the first update, the percentage of failures would have
>> been lower. So I think it makes sense to try this… If after a few
>> months with this we still have too many updates blocked by gating, we
>> can reevaluate.
>>
>>> I would be happy to provide a monthly report of failures so that we
>>> can measure the rate of false positives.
>>>
>>> As I promised before, maybe just work with us on it. These tests have
>>> been enabled for a while and I've only seen a handful of package
>>> maintainers look at the failures (you, Zbyszek, being one of them;
>>> thank you!).
>>>
>>> We do want them to be useful tests, and I promise that when a failure
>>> happens because of our infra or the tests themselves being flaky, we
>>> try to get it fixed.
>>
>> One more question: are packagers able to restart the tests?
>
> @Dusty Mabe <dusty@xxxxxxxxxxxxx> will know better, but I don't think
> packagers can restart the tests currently.

No. Not currently. But it is something we could look into enabling
and/or making easier.

Dusty
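The retry behavior Dusty describes (collect failures, retry them, and count a
passing retry as a pass) can be sketched roughly like this. It is a minimal
illustration only; run_test, the "run-one-test" command, and the attempt count
are assumptions for the sketch, not the actual coreos pipeline code:

```python
import subprocess

def run_test(name: str) -> bool:
    """Hypothetical stand-in for launching one test case;
    "run-one-test" is an invented command, not a real CLI."""
    return subprocess.run(["run-one-test", name]).returncode == 0

def run_with_retry(name: str, attempts: int = 2) -> bool:
    """Count the test as passing if any attempt passes, so one-off
    flakes (network hiccups, quay/s3/GitHub outages) are absorbed
    and never show up in the reported results."""
    for _ in range(attempts):
        if run_test(name):
            return True
    return False
```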
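The 95%-true-positive threshold discussed in the thread is simple arithmetic
over the kind of monthly failure report Dusty offers: the fraction of reported
failures that were real regressions rather than infra or test flakes. A
hypothetical sketch, assuming an invented record format with a "cause" field:

```python
def true_positive_rate(failures: list[dict]) -> float:
    """Fraction of reported failures that were real regressions.
    Expects records like {"cause": "regression" | "infra" | "flake"};
    the record format is invented for this illustration."""
    if not failures:
        return 1.0  # no failures reported, nothing to discount
    real = sum(1 for f in failures if f["cause"] == "regression")
    return real / len(failures)

# e.g. 19 real regressions out of 20 reported failures -> 0.95,
# right at the threshold proposed in the thread
print(true_positive_rate(
    [{"cause": "regression"}] * 19 + [{"cause": "infra"}]
))
```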