On Thu, 2017-07-06 at 21:15 -0400, Matthew Miller wrote:
> First, there is gating from rel-eng and QA in progress here:
> https://fedoraproject.org/wiki/Changes/NoMoreAlpha (Note that this is
> compose/validation gating, not the CI stuff we're also talking about
> separately.) That's key in keeping the release basically stable. But
> there's another part:

So, I guess I should set some more detailed expectations here. At least from my perspective on it.

Compose gating is something that, in principle, we *can* do already. It's not very difficult. We have quite extensive automated testing of each compose, between openQA and autocloud. We have the results reported to resultsdb. It is not fundamentally difficult to write a thing which queries resultsdb for the test results for a given compose and makes a decision about whether that compose should be released, based on those results.

However, there are some counterpoints.

One, as always, there's devilry in the details. We have to decide *what* the criteria are, exactly. For autocloud, it's sort of 'easy', because autocloud results are not very granular at all - it's basically a straight up pass/fail for each image in a compose. But for openQA...well, I can draw up a list of the openQA tests that correspond to Alpha criteria, easily enough. But do we want to go straight there, or start smaller?

Also, what do we do about...*complex* failures? For instance, I already know, right now, that every so often, anaconda just crashes in the middle of an install. It's been doing that for a year or so. It's a mysterious python crash, and it happens very rarely. There are also similar known 'it occasionally just crashes' bugs in GNOME and KDE - sometimes KDE startup just fails and the system sits at a black screen forever, occasionally GNOME crashes back to gdm shortly after login. And of course sometimes openQA just screws up (it's not perfect), and sometimes some network or mirror issue produces a failure, etc etc.

So do we introduce some sort of 'fuzz factor' and say 'it's OK if 90% of the tests pass'? Do we set things up so the 'is it OK to ship?' calculation automatically re-runs every time a test is restarted, so we can just manually restart these kinds of tests and if they pass, the compose will ship? Do I start writing some kind of complex openQA plugin to try and identify 'known intermittent failures' and automatically restart such tests? (this is the sort of thing that's possible, but could also eat my life.)

There's another big point: I suspect that doing this kind of compose gating is going to *feel* like quite a big change to the distro development process. releng has been kinda gradually introducing more and more hurdles to the compose and sync process, but it's still *more or less* the case that people expect a Rawhide compose to succeed and sync every day - when composes fail for more than two or three days, people start getting antsy. If we add compose gating, I really don't think there's any possible outcome except that 'successful' composes get noticeably rarer. I can't put numbers on that yet; we could write a script which applies whatever criteria we decide on to the last, say, year of Rawhide composes and gives us an idea what percentage would've met the criteria, but of course that's not really a true representation because the act of introducing criteria provides a powerful incentive, which wasn't there before, for people to fix problems. All we can really say is that it's pretty likely that 'successful' Rawhide composes will become to some noticeable degree rarer. And I'm pretty sure that will have other consequences on how the whole process of working on Fedora feels. But it's difficult to be too specific until we actually do this.

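
To make the 'apply the criteria to past composes' idea a bit more concrete, here is a rough Python sketch. It is only an illustration under assumptions: the `Result` record, the test names and the functions `compose_passes` and `backtest` are all made up for this example and are not the resultsdb schema or any existing tool, and it works on results already held in memory rather than querying resultsdb. The `min_pass_ratio` parameter stands in for the 'fuzz factor' (1.0 means every required test must pass), and a later result for the same test replaces an earlier one, which is one simple way to model a manually restarted test.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Sequence, Tuple


class Result(NamedTuple):
    """One test result. Hypothetical shape, NOT the real resultsdb schema."""
    compose_id: str
    testcase: str
    outcome: str  # only "PASSED" and "FAILED" are used in this sketch


def compose_passes(results: Iterable[Result], required: Sequence[str],
                   min_pass_ratio: float = 1.0) -> bool:
    """Decide whether a single compose meets the gate.

    `results` must be in chronological order: a later result for the same
    test overrides the earlier one (e.g. after a manual restart).
    A required test with no result at all counts as not passed.
    """
    if not required:
        raise ValueError("need at least one required test")
    latest: Dict[str, str] = {}
    for result in results:
        latest[result.testcase] = result.outcome
    passed = sum(1 for name in required if latest.get(name) == "PASSED")
    return passed / len(required) >= min_pass_ratio


def backtest(results: Iterable[Result], required: Sequence[str],
             min_pass_ratio: float = 1.0) -> Tuple[Dict[str, bool], float]:
    """Apply the gate to every compose seen in `results`.

    Returns the per-compose verdicts and the share of composes that passed.
    """
    by_compose: Dict[str, List[Result]] = defaultdict(list)
    for result in results:
        by_compose[result.compose_id].append(result)
    verdicts = {
        compose_id: compose_passes(compose_results, required, min_pass_ratio)
        for compose_id, compose_results in by_compose.items()
    }
    share = sum(verdicts.values()) / len(verdicts) if verdicts else 0.0
    return verdicts, share
```

Running `backtest` once with `min_pass_ratio=1.0` and once with something like `0.9` over the same history would give a rough feel for how much a fuzz factor changes the share of 'successful' composes, with the caveat already noted that past results don't reflect how people would behave once the gate actually exists.
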
Honestly - I was kinda banking on us having a reasonable amount of the usual 'slack time' at the start of a release cycle to try and do a bit of a 'soft launch' of compose gating, with at least a few weeks for us to shake down all the details and get a feel for how significant the impact on the development process is. What you're talking about feels somewhat different. I'm not necessarily comfortable with us banking on the idea that compose gating is going to be something we can kick in *immediately* with complete success, to the degree of basing F28/F29 plans on it.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx