Re: Requiring package test instructions (was: Re: Too fast karma on Bodhi updates)

Adam Williamson <adamwill@xxxxxxxxxxxxxxxxx> · Tue, 12 Jul 2016 12:16:59 -0700

On Wed, 2016-07-13 at 00:18 +0530, Siddhesh Poyarekar wrote:
> On Tue, Jul 12, 2016 at 11:38:01AM -0700, Adam Williamson wrote:
> > 
> > This isn't really correct, because there is no simple relationship
> > between 'bugs claimed to be fixed actually are fixed' and 'update
> > should be released'. Both of these are possible:
> > 
> > 1) an update which fixes the bugs it claims to fix, but should *NOT* be
> > released
> > 2) an update which does not fix all the bugs it claims to fix, but
> > *SHOULD* be released
> > 
> > An example of 1) is an update which claims to fix a minor bug, and
> > does, but creates a *major* bug. e.g., fixes a typo in the package
> > description, but causes the app not to run at all. This update should
> > be given -1 karma (negative response to 'Is the update generally
> > functional?'), not +1.
> > 
> > An example of 2) is an update which claims to provide a critical
> > security fix and a trivial bug fix, and *does* fix the security issue,
> > but the trivial bug fix doesn't work. This update should be given +1
> > karma (positive response to 'Is the update generally functional?'), not
> > -1.
> 
> Sure, and I said nothing to contradict that.  A good test involves
> verification of fixes *and* regression tests, not one or the other.
> However using lack of regression tests as an excuse for not verifying
> fixes (and more importantly, still leaving positive feedback) is not
> acceptable.  The points you mention go into intricacies of testing
> feedback whereas I am talking about the very basics.

This is setting far too high a bar for a project like Fedora. We take
the feedback we can get; we are not in a position to demand that all
update testers perform comprehensive testing of all possible facets of
an update. It is always important to bear in mind that Bodhi is a system
designed to do the best we can to provide some moderate level of update
quality checking. It has never been intended or expected that Bodhi
feedback be of professional-level quality and comprehensiveness.

> > 
> > This is in fact *why*, in Bodhi 2.0, there are separate feedback
> > entries for each individual bug listed by the update - so testers can
> > separate 'does or does not fix bug X' feedback from 'update generally
> > works' feedback.
> > 
> > It is possible (in my opinion) for a tester to reasonably provide 'Is
> > the update generally functional?' feedback without actually checking
> > all or even any of the claimed bug fixes, and I've done this myself
> > quite often. It can quite often be difficult (if the bug is in a very
> > complex use case) or impossible (if the bug is specific to e.g. a piece
> > of hardware you don't own) to check bug fixes.
> 
> Sure, such feedback is very useful, but I don't think it should be
> accompanied by a karma +1.  The 'update generally functional' should
> only have options of karma 0 and -1 IMO.

I'm sorry, but I still don't agree, and this would clearly be a major
departure from how the current system is actually designed and intended
to be used.

Note that *only* responses to 'generally functional' are actually used
to calculate the karma score. All the other feedback items have no
relevance to scoring or any kind of actual technical gating mechanism,
they are purely informational to the update submitter at this point in
time. If 'generally functional' had no +1, it would be impossible for
any update ever to be auto-pushed or pushed ahead of the 'waiting
period', because no update could ever get a positive karma score.
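
To make the scoring rule above concrete, here is a minimal Python
sketch. It is not Bodhi's actual code: the Feedback class, its field
names, and the stable_threshold value are all hypothetical, chosen only
to illustrate that the score sums the 'generally functional' responses
and ignores everything else.

```python
from dataclasses import dataclass, field


@dataclass
class Feedback:
    # Hypothetical model of one tester's response; not Bodhi's real schema.
    generally_functional: int = 0  # -1, 0, or +1
    bug_feedback: dict = field(default_factory=dict)       # bug id -> -1/0/+1
    testcase_feedback: dict = field(default_factory=dict)  # test case -> -1/0/+1


def karma_score(responses):
    """Sum only the 'generally functional' votes.

    Bug and test case feedback is informational and is ignored here.
    """
    return sum(r.generally_functional for r in responses)


def can_autopush(responses, stable_threshold=3):
    # stable_threshold is an assumed value, used for illustration only.
    return karma_score(responses) >= stable_threshold
```

Under this model, a tester who reports that a listed bug is not fixed
but that the update generally works still contributes +1 to the score.
If the only allowed 'generally functional' values were 0 and -1, the
sum could never be positive, so can_autopush() could never return True.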

As a quick refresher, Bodhi 1.x did not have multiple feedback 'types'
(items in the Feedback section in the web UI); it simply let each
tester give a single +1 or -1 'vote', along with comments, for each
update. In Bodhi 2.x, the 'is the update generally functional' feedback
item exactly matches the entirety of what Bodhi 1.x allowed testers to
do; we *added on* the additional feedback items to provide a clear way
for testers to provide feedback on whether bugs were fixed and test
cases satisfied, without the confusion of whether a non-fixed bug
should result in a '-1'.

Bodhi 2 is generally designed to give us the ability to be smarter
about gating - in theory, we could have more complex policies than
using the simple numeric 'score' based on the 'generally functional'
feedback, like never autopushing an update which got a -1 for 'Does the
system's basic functionality continue to work after this update?'.
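
As a rough illustration of what such a smarter policy might look like,
here is a small Python sketch. It is an assumption about how the rule
could be expressed, not something Bodhi implements in this form; the
function name, argument names, and threshold value are hypothetical.

```python
def autopush_allowed(functional_votes, basic_functionality_votes, threshold=3):
    """Decide whether an update may be autopushed under the sketched policy.

    functional_votes: -1/0/+1 responses to 'Is the update generally
        functional?'.
    basic_functionality_votes: -1/0/+1 responses to 'Does the system's
        basic functionality continue to work after this update?'.
    threshold: assumed karma threshold, for illustration only.
    """
    # Any single negative report on basic functionality vetoes the autopush,
    # however high the numeric score is.
    if any(vote < 0 for vote in basic_functionality_votes):
        return False
    return sum(functional_votes) >= threshold
```

The point of the design is that the veto is independent of the numeric
score: four +1 votes would not outweigh one report that the system no
longer works after the update.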

We can also allow customization at the package level ('all updates for
package foo must get a +1 for test case bar') or update level ('don't
push this update stable unless there's +2 for RHBZ #123456'). There's
actually an initial implementation of this in current Bodhi, but I
haven't seen many packagers using it. In most cases, none of the
feedback items besides 'Is the update generally functional?' actually
does anything beyond communicating the response to the update
submitter.
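
The package-level and update-level requirements described above could
be modelled roughly as follows. This is only a sketch under stated
assumptions: the function, the 'testcase:' and 'bug:' key format, and
the data layout are hypothetical and do not reflect how Bodhi stores or
checks requirements.

```python
def requirements_met(requirements, feedback_totals):
    """Check that every required feedback item reached its minimum total.

    requirements: dict mapping an item name to the minimum summed
        feedback it needs, e.g. {"testcase:bar": 1}.
    feedback_totals: dict mapping an item name to the sum of tester
        responses (-1/0/+1 each) received so far.
    Items with no feedback yet count as 0.
    """
    return all(
        feedback_totals.get(item, 0) >= minimum
        for item, minimum in requirements.items()
    )


# Package-level rule: every 'foo' update needs +1 on test case 'bar'.
package_rules = {"testcase:bar": 1}
# Update-level rule: this update needs +2 on RHBZ #123456.
update_rules = {"bug:123456": 2}

combined = {**package_rules, **update_rules}
print(requirements_met(combined, {"testcase:bar": 1, "bug:123456": 2}))
```

With both rules combined, the update would be held back until the test
case and the bug had each collected enough positive feedback.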

I have revised https://fedoraproject.org/wiki/QA:Update_feedback_guidelines
again to better cover the Bodhi 2 design and provide specific guidance
on how to provide responses to the different feedback items; the
previous text hadn't really been updated for Bodhi 2 at all.
-- 

Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net