Hi, folks. At the last QA meeting, I volunteered (dumb of me!) to draft a policy for testing updates - basically, a policy for what kind of feedback should be posted in Bodhi for candidate updates. This turns out to be pretty hard. =)

Thinking about it from a high-level perspective, I think it becomes pretty clear that the current system is just broken. The major problem is that it attempts to balance things that don't really balance: it lets you say 'works for me' or 'doesn't work', then subtracts the count of the second from the count of the first to give a 'rating' for the update. That number doesn't really mean anything. As has been rehashed many times, there are situations where an update with a positive rating shouldn't go out, and situations where an update with a negative rating should. So the current system isn't really that great, and I can't think of a way to draft a policy that would guide its use reliably. I think it'd be much more productive to revise the Bodhi feedback system alongside producing a policy.

So, here's a summary of what the new system should aim for. At the high level, what is this system for? It's there for three purposes:

1) to provide maintainers with information they can use in deciding whether to push updates.
2) to provide a mechanism for mandating a certain minimum level of manual testing for 'important' packages, under Bill Nottingham's current update acceptance criteria proposal.
3) to provide an 'audit trail' we can use to look back on how the release of a particular update was handled, in cases where there are problems.

Given the above, we need to capture the following types of feedback, as far as I can tell. I don't think there is any sensible way to assign numeric values to any of it; we have to trust people to make sensible decisions as long as it's provided, in accordance with whatever policy we implement on what qualities an update must have.

1. I have tried this update in my regular day-to-day use and seen no regressions.
2. I have tried this update in my regular day-to-day use and seen a regression: bug #XXXXXX.
3. (Where the update claims to fix bug #XXXXXX) I have tried this update and found that it does fix bug #XXXXXX.
4. (Where the update claims to fix bug #XXXXXX) I have tried this update and found that it does not fix bug #XXXXXX.
5. I have performed the following planned testing on the update: (link to test case / test plan) and it passes.
6. I have performed the following planned testing on the update: (link to test case / test plan) and it fails: bug #XXXXXX.

Testers should be able to file multiple types of feedback in one operation - for instance, 4+1 (the update didn't fix the bug it claimed to, but doesn't seem to cause any regressions either). Ideally, feedback entry should be 'guided', with a freeform element where needed - a space to enter bug numbers, for instance.

There is one type of feedback we don't really want or need to capture: "I have tried this update and it doesn't fix bug #XXXXXX", where the update doesn't claim to fix that bug. This is quite a common '-1' in the current system, and one we should eliminate.

I think Bill's proposed policy can be modified quite easily to fit this. All it would need to say is that, for 'important' updates to be accepted, they would need one 'type 1' feedback from a proven tester and no 'type 2' feedback from anyone (or something along those lines; this isn't the main thrust of my post, so please don't sidetrack it too much :>).
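To make that concrete, here's a minimal sketch of the data model and the acceptance check just described. This is purely hypothetical - FeedbackType, Feedback and update_acceptable are names I've invented for the sketch, not anything that exists in Bodhi - but it shows the six types plus a bug-number field carry enough information to evaluate such a rule without computing any rating:

# Hypothetical sketch -- not actual Bodhi code; all names are invented
# for illustration only.
from dataclasses import dataclass, field
from enum import Enum


class FeedbackType(Enum):
    NO_REGRESSIONS = 1     # day-to-day use, no regressions seen
    REGRESSION = 2         # day-to-day use, regression found (needs bug #)
    FIXES_CLAIMED_BUG = 3  # confirms a bug the update claims to fix
    FAILS_CLAIMED_BUG = 4  # does not fix a bug the update claims to fix
    TEST_PLAN_PASS = 5     # planned testing (test case / plan) passed
    TEST_PLAN_FAIL = 6     # planned testing failed (needs bug #)


@dataclass
class Feedback:
    """One tester's feedback; several types may be filed in one
    operation, e.g. types 4 + 1 (didn't fix the claimed bug, but no
    regressions seen either)."""
    tester: str
    proven_tester: bool
    types: set[FeedbackType]
    # The 'guided' freeform element: bug numbers attached to the
    # feedback types that need them.
    bugs: dict[FeedbackType, int] = field(default_factory=dict)


def update_acceptable(feedback: list[Feedback]) -> bool:
    """The proposed rule for 'important' updates: at least one type 1
    from a proven tester, and no type 2 from anyone. Note there is no
    arithmetic here -- just inspection of what was filed, and by whom."""
    has_proven_type1 = any(
        f.proven_tester and FeedbackType.NO_REGRESSIONS in f.types
        for f in feedback
    )
    has_any_type2 = any(FeedbackType.REGRESSION in f.types for f in feedback)
    return has_proven_type1 and not has_any_type2


# Example: one proven tester files types 4 + 1 in a single operation.
fb = Feedback(
    tester="sometester",
    proven_tester=True,
    types={FeedbackType.FAILS_CLAIMED_BUG, FeedbackType.NO_REGRESSIONS},
    bugs={FeedbackType.FAILS_CLAIMED_BUG: 123456},
)
print(update_acceptable([fb]))  # True: proven type 1, no type 2 anywhere

The point of the sketch is just that the raw feedback records are all the policy needs to operate on; no number ever has to be derived from them.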
The system could display a count of how many of each type of feedback a given update has received, but I don't think there's any way we can sensibly perform some mathematical operation on those numbers and get a meaningful 'rating' for the update. Any such formula would give odd or undesirable results in some cases, I think (just as the current one does). I believe the above system would be sufficiently clear that there would be no need for such a number, and we would be able to evaluate updates properly based just on the information listed.

What are everyone's thoughts on this? Thanks!
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Fedora Talk: adamwill AT fedoraproject DOT org
http://www.happyassassin.net