Hey folks!

So here's an idea I was thinking about over the RH shutdown: I propose we gate stable release critical path updates on the openQA tests.

Currently we run a set of ~50 tests on every critpath update. For an F33 update this is the set:

https://openqa.fedoraproject.org/tests/overview?distri=fedora&version=33&build=Update-FEDORA-2021-4634e99fa0&groupid=2

The results are reported to Bodhi - you can see them on the "Automated Tests" tab - but don't automatically affect whether you can push the update...yet. :) I'm proposing we start gating these updates on those test results.

We've actually discussed this before but always thought it was difficult or impossible because these tests aren't run on all updates, only critical path updates (and a small hand-tended list of additional packages to test updates for that's maintained in the scheduler). Greenwave doesn't have a "require this test to be passed or not present" feature, and it would actually be hard/not a good idea to do so, so we sort of thought we were stuck.

But recently I was editing the Fedora greenwave config and realized there's actually a simple solution: Bodhi can just make *different greenwave queries* for critpath and non-critpath updates. We can have alternate greenwave "decision contexts" for critpath and non-critpath updates, and have the critpath one require passes for the openQA tests. That solves the problem neatly, AFAICS.

The result of this would be that critpath updates could not go stable if any of the openQA tests failed, unless a waiver was issued. I think this should be viable and not cause any major issues.

Implementing this would be relatively simple, and would involve two things: adding some new bits to Fedora's greenwave policy definition, and patching Bodhi to use a different decision_context for greenwave queries for non-critpath updates and critpath updates. I could have both those changes ready for review in a day, probably.
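As a rough illustration of the two-context idea, here is a minimal Python sketch. It is not Bodhi's or Greenwave's actual code. The two context names and the test names are hypothetical, and the query layout (product_version, decision_context, subject) is an assumption about the general shape of a Greenwave decision request that should be checked against the Greenwave documentation. The last function is a toy model of the gate itself: a required test that is missing counts as unsatisfied, which is why critpath updates need their own context rather than a "passed or not present" rule.

```python
# Hypothetical decision context names, used only for illustration.
CONTEXT_DEFAULT = "bodhi_update_push_stable"
CONTEXT_CRITPATH = "bodhi_update_push_stable_critpath"


def decision_context(critpath: bool) -> str:
    """Pick the decision context based on whether the update is critpath."""
    return CONTEXT_CRITPATH if critpath else CONTEXT_DEFAULT


def build_decision_query(update_id: str, release: int, critpath: bool) -> dict:
    """Build the body of a decision query (assumed shape, see lead-in)."""
    return {
        "product_version": f"fedora-{release}",
        "decision_context": decision_context(critpath),
        "subject": [{"item": update_id, "type": "bodhi_update"}],
    }


def can_push_stable(required, results, waived=frozenset()):
    """Toy gate: every required test must have passed or been waived.

    A required test with no result at all is treated as unsatisfied.
    Returns (allowed, list_of_unsatisfied_tests).
    """
    unsatisfied = [
        test for test in required
        if results.get(test) != "passed" and test not in waived
    ]
    return not unsatisfied, unsatisfied


if __name__ == "__main__":
    query = build_decision_query("FEDORA-2021-4634e99fa0", 33, critpath=True)
    print(query["decision_context"])

    # Hypothetical test names and results.
    required = ["update.test_a", "update.test_b"]
    results = {"update.test_a": "passed", "update.test_b": "failed"}
    print(can_push_stable(required, results))
    print(can_push_stable(required, results, waived={"update.test_b"}))
```

In this model the non-critpath context would simply carry no openQA requirements, so those updates are unaffected by tests that never ran for them.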
It would be equally simple to revert the change if it turned out to be a bad idea.

I've been monitoring the update test results ever since we started doing update testing, and I check *every* update test failure. Sometimes there's a test system issue or a non-bug change that makes the tests fail, and when that happens, I fix the problem and re-run the tests. Where the failure is a real bug, I investigate it and file a report to the appropriate place, then add a comment on the update explaining the issue, usually with negative karma. So we've been doing something similar to "gating" for years, just implemented manually :) I would like to make it real gating to avoid cases where an update with a bug gets pushed stable before I manage to file a comment; this has happened a few times to packages which tend to get feedback very fast, or if an update comes out over a weekend or something and I don't see the failure for a few days.

There are a few other folks already with sufficient knowledge of the openQA system that they could investigate a failure if necessary - lruzicka and pwhalen, for instance - and we're happy to help others learn the process if they'd be interested.

You can see the recent history of update tests here:

https://openqa.fedoraproject.org/group_overview/2?limit_builds=100

There are more with a failure than would usually be the case. This is because of a bug in fwupd which causes GNOME Software to hang sometimes. I've spent this week investigating exactly that bug with hughsie. As part of that process we did find a workaround a couple of days ago; if we had gating enabled, I would have put that workaround into production to make the test pass reliably and not cause updates to be gated, and/or just restarted failed tests until they passed. There's also a change in how Cockpit renders a page that caused the recent Cockpit updates to get a failure each, so after I write this email I'll be updating that test and re-running it.
That should give you a flavor of how things go in general.

What do people think of this idea? Any questions? Thanks!
-- 
Adam Williamson
Fedora QA
IRC: adamw | Twitter: adamw_ha
https://www.happyassassin.net