On Thu, 2012-11-22 at 11:03 -0700, Tim Flink wrote:
> The problem with our release validation process is where to draw the
> line between what we test and what we don't. Our test cases are pretty
> much human bash scripts and that seems to be what people expect. If
> you follow all the steps of a test case, you can mark it pass/fail and
> be done with it; no improvisation or filling in unwritten stuff should
> be required. I'm personally not a fan of this testing style but it is
> what it is and I choose my battles - test case writing style is not one
> that I'm choosing to fight right now.

There's a clear benefit to the human bash script style which I think is
important in the context of release validation. It ensures (or at least
tries to - even with our relatively strict test cases, we _still_ have
problems with this) that although we have different people in different
places in different time zones running the tests on different machines,
they're all broadly testing the same thing.

If our test cases were less deterministic, it'd be very difficult to
compare results. If the upgrade test case effectively just said 'run an
upgrade however you like', and we saw 'Fail' from Bob for TC2, 'Pass'
from Anna for TC4, 'Pass' from James for TC6, 'Fail' from Bob for TC8,
and 'Pass' from Anna for RC1, what conclusions could we draw about how
the upgrade behaviour had changed over those composes? Very few or
none, without careful analysis of any filed bugs, because we'd have no
idea whether all those people had been testing the same thing or really
quite different configurations.

Even though we try to be human-bash-script-y at present, the fact that
it's really difficult to nail down all environmental factors already
often gives us trouble replicating each other's results - we have lots
of 'how did you set up XYZ? was your repo on this server or that
server?' stuff flying around on IRC. If our tests were less
deterministic I think we'd find it even harder. We might gain a bit
more coverage, but at the cost of making it more or less impossible to
confidently compare test results from different testers across
releases.

> With the testing resources we have right now, I think that limiting the
> upgrade test case to the default package set is our best option. Does
> it cover everything? No. Does it cover the most common issues? Probably.
> It does make some assurances that the fedup process is at least working
> for a known base case.

Right, this is the main point here - as both you and Johann said, the
limited scope of the upgrade test is entirely intentional, an
acknowledgement of the size of the problem space and the resources we
have to apply to it.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | identi.ca: adamwfedora
http://www.happyassassin.net
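
For a concrete sense of what the 'human bash script' style means in
practice, here is a rough sketch of the default-package-set upgrade
test written out as an actual script. This is a hypothetical
illustration, not the real test case page, and the exact fedup
invocation is an assumption based on the F18-era upgrade instructions:

    #!/bin/bash
    # Hypothetical sketch only - not the actual wiki test case.
    # Phase 1: run on a default ("known base case") Fedora 17 install.
    set -e
    yum -y update       # start from a fully updated system
    fedup --network 18  # stage the F18 upgrade (flags are an assumption)
    # The tester then reboots, picks the "System Upgrade" boot entry,
    # and fedup applies the staged upgrade.

    # Phase 2: run on the upgraded system to decide Pass/Fail.
    cat /etc/fedora-release  # should report Fedora release 18

Spelling things out at this level is exactly what makes the results
comparable: Bob, Anna and James are all guaranteed to be exercising
the same path on the same base configuration across composes.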