On Fri, 2014-01-24 at 19:26 +0100, Michael Schwendt wrote:
> > * That update made it out to the stable updates! In other words, the
> > draconian Update Policies that were enacted in a vain attempt to prevent
> > such issues from happening utterly failed at catching this bug.
>
> Those policies are not "draconian" enough [1]. On erroneous belief that
> a +1 from three different testers would mean that the update has seen
> enough testing, the test update has been published with the default karma
> threshold of +3. The testers have failed. It's too simple for testers to
> rush through the voting in bodhi without testing the updates
> painstakingly. "The faster the better" has lead to a fatal mistake in
> this case.

I think that's being unnecessarily harsh on the testers. It's not at all
obvious to anyone that you ought to test update/install of another
package in order to validate an update to selinux-policy-targeted. Hell,
I don't do that.

Hate to sound like a broken record, but really the problem here is just
the complete lack of granularity in the karma system. To phrase it
theoretically, we know there is a huge spectrum of meanings for both +1
and -1:

+1 --

* I installed it and nothing blew up
* I installed it, rebooted and nothing blew up
* I installed it, ran the entire test suite, grabbed the source tarball
  and inspected it line-by-line for vulnerabilities, fuzz tested all the
  variable handling, then deployed it to my extensive test farm for a
  week and assessed the results
* It fixes my bug, and I didn't test anything else
* It fixes my bug, and nothing blew up
* It fixes my bug, and...(you see where I'm going with this)
* It installs, it works, maybe it fixes some bugs, but it also
  introduces this other regression
* I like the update text / the update submitter / candy

-1 --

* It failed to install
* I installed it, and something blew up
* I installed it, rebooted and something blew up
* (etc)
* It doesn't fix my bug (and that's the only bug the update was meant
  to fix)
* It doesn't fix my bug (but the update also fixes 50 other bugs,
  successfully)
* It doesn't fix this other bug I have that the update didn't even
  claim to fix
* It installs, it works, maybe it fixes some bugs, but it also
  introduces this other new bug (yes, this is identical to one of the +1
  entries. That is the point. The same thing can also be registered as
  0, giving us the perfect set. Depending on the details of what's fixed
  and what's broken, and the individual karma submitter's instincts, it
  can seem 'right' to file this as any one of the three possible
  values.)
* It installs, it works, it doesn't exactly introduce any bugs, but I
  think it is not compliant with the update policy (i.e. too drastic a
  change in behaviour from the previous package)
* I don't like the update text / the update submitter / candy

The 'comment' field exists to allow people to express all these things,
but as it's just a completely free-form text field, it's intrinsically
impossible to base any programmatic handling, or even policy, on it.

In theory maintainers could submit updates without using autokarma and
then keep a careful eye on the feedback and 'tend' their updates
manually, but I think it's pretty clear that in practice, this is not
what happens: maintainers really want to be able to use the karma system
as a 'helper', they want to farm out the evaluation process to Bodhi/the
karma system. But our current system is too stupid to handle this
properly, so we get these breakdowns.

With a more flexible karma system we have a *lot* of opportunity to do
much cleverer stuff. We can provide presets for all the above different
things that are currently commonly expressed via +1 or -1 with a
comment. This opens up possibilities at two different levels: the distro
policy level, and the packager level.
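To make the 'presets' idea a bit more concrete, here is a minimal Python
sketch of what typed karma might look like. Everything in it is
hypothetical: the names KarmaType, Feedback, tally and ready_to_push are
invented for illustration and are not part of Bodhi or any existing
tool. It only shows that once feedback carries a type instead of a bare
+1/-1, a rule can be evaluated without anyone reading the comments.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class KarmaType(Enum):
    # Hypothetical presets; none of these names exist in Bodhi.
    INSTALLED_OK = "installed, nothing blew up"
    FIXES_MY_BUG = "fixes my bug"
    TEST_SUITE_PASSED = "ran the test suite and it passed"
    WORKS_BUT_REGRESSES = "works, but introduces a regression"
    DOES_NOT_FIX_BUG = "does not fix my bug"
    COMPLETELY_BUSTED = "completely busted"


@dataclass(frozen=True)
class Feedback:
    tester: str
    kind: KarmaType
    comment: str = ""  # free-form text stays available, but is optional


def tally(feedback):
    """Count feedback entries per preset type."""
    return Counter(item.kind for item in feedback)


def ready_to_push(feedback, required, blockers, block_threshold=1):
    """Illustrative rule: hold the update if any blocker type reaches
    block_threshold; otherwise require at least one report of every
    type listed in `required`."""
    counts = tally(feedback)
    if any(counts[kind] >= block_threshold for kind in blockers):
        return False
    return all(counts[kind] >= 1 for kind in required)
```

Note that the ambiguous 'it works but introduces a regression' case gets
its own type here, so the submitter no longer has to guess whether it
counts as +1, 0 or -1.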
We can make the distro policy much more fine-grained, if we want to - we
can require certain of the 'karma types' to be available in all updates,
and for instance, block any update where X people pull the 'it's
completely busted' or 'it introduces a security vulnerability' cord,
regardless of how much broadly-categorized 'positive' karma it has.

At the packager level, the packager gets the freedom to define a much
more fine-grained policy for when they're happy that updates to their
package are 'good to go', but they still don't have to sit there reading
the emails and manually interpreting what people have written. You get
to define the policy that makes the most sense for your package, within
the confines of the distro-wide policy - if you have a good
package-specific test suite, you can say to the auto-karma system 'don't
send this update out until at least one person sets the "I ran the test
suite and it passed" karma property'.

Those are just examples: the point is that what we badly need here is a
more expressive and flexible system. (As well, as I've said elsewhere in
the discussion, as a good automated test for this specific and
well-known category of 'delayed action' update problems.)
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net

-- 
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/devel
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct