Re: RFC - kernel selftest result documentation (KTAP)

Frank Rowand <frowand.list@xxxxxxxxx> · Sat, 20 Jun 2020 10:03:54 -0500

On 2020-06-20 01:44, David Gow wrote:
> On Sat, Jun 20, 2020 at 1:58 AM Frank Rowand <frowand.list@xxxxxxxxx> wrote:
>>
>> On 2020-06-16 07:08, Paolo Bonzini wrote:
>>> On 15/06/20 21:07, Bird, Tim wrote:
> 
>>>>>> Finally,
>>>>>>   - Should a SKIP result be 'ok' (TAP13 spec) or 'not ok' (current kselftest practice)?
>>>>>> See https://testanything.org/tap-version-13-specification.html
>>>>>
>>>>> Oh! I totally missed this. Uhm. I think "not ok" makes sense to me "it
>>>>> did not run successfully". ... but ... Uhhh ... how do XFAIL and SKIP
>>>>> relate? Neither SKIP nor XFAIL count toward failure, though, so both
>>>>> should be "ok"? I guess we should change it to "ok".
>>>
>>> See above for XFAIL.
>>>
>>> I initially raised the issue with "SKIP" because I have a lot of tests
>>> that depend on hardware availability---for example, a test that does not
>>> run on some processor kinds (e.g. on AMD, or old Intel)---and for those
>>> SKIP should be considered a success.
>>
>> No, SKIP should not be considered a success.  It should also not be considered
>> a failure.  Please do not blur the lines between success, failure, and
>> skipped.
>
> I agree that skipped tests should be their own thing, separate from
> success and failure, but the way they tend to behave tends to be
> closer to a success than a failure.
> 
> I guess the important note here is that a suite of tests, some of
> which are SKIPped, can be listed as having passed, so long as none of
> them failed. So, the rule for "bubbling up" test results is that any
> failures cause the parent to fail, the parent is marked as skipped if
> _all_ subtests are skipped, and otherwise is marked as having
> succeeded. (Reversing the last part: having a suite be marked as
> skipped if _any_ of the subtests are skipped also makes sense, and has
> its advantages, but anecdotally seems less common in other systems.)

That really caught my attention as something to be captured in the spec.

My initial response was that bubbling up results is the domain of the
test analysis tools, not the test code.

If I were writing a test analysis tool, I would want the user to have
the ability to configure the bubble up rules.  Different use cases
would desire different rules.
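
To make that concrete, here is a rough sketch of what a configurable
bubble up rule could look like in an analysis tool.  It is only an
illustration, not a proposal for the spec: the names Result, bubble_up
and skip_policy are made up for this example, and treating an empty
set of subtests as a pass is my assumption.  The "all" policy is the
rule David describes above; "any" is the reversed variant he mentions.

```python
from enum import Enum


class Result(Enum):
    """Three distinct outcomes; SKIP is neither a pass nor a failure."""
    PASS = "pass"
    FAIL = "fail"
    SKIP = "skip"


def bubble_up(results, skip_policy="all"):
    """Combine subtest results into a single parent result.

    skip_policy="all": parent is SKIP only if every subtest was skipped.
    skip_policy="any": parent is SKIP if at least one subtest was skipped.
    Under either policy, any failure makes the parent fail.
    """
    results = list(results)

    # A single failure always wins.
    if Result.FAIL in results:
        return Result.FAIL

    skipped = [r for r in results if r is Result.SKIP]
    if skip_policy == "all":
        # Assumption: a parent with no subtests counts as a pass.
        parent_skipped = bool(results) and len(skipped) == len(results)
    elif skip_policy == "any":
        parent_skipped = bool(skipped)
    else:
        raise ValueError(f"unknown skip_policy: {skip_policy!r}")

    return Result.SKIP if parent_skipped else Result.PASS
```

With this, bubble_up([Result.PASS, Result.SKIP]) gives Result.PASS under
the default policy and Result.SKIP under skip_policy="any", which is
exactly the kind of choice I would want to leave to the user.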

My second response was to start thinking about whether the tests
themselves should have any sort of bubble up implemented.  I think
it is a very interesting question.  My current mindset is that
each test is independent, and there is not a concept of an umbrella
test that is the union of a set of subtests.  But maybe there is
value to umbrella tests.  If there is a concept of umbrella tests
then I think the spec should define how skip bubbles up.

> 
> The other really brave thing one could do to break from the TAP
> specification would be to add a "skipped" value alongside "ok" and
> "not ok", and get rid of the whole "SKIP" directive/comment stuff.
> Possibly not worth the departure from the spec, but it would sidestep
> part of the problem.

I like being brave in this case.  Elevating SKIP to be a peer of
"ok" and "not ok" gives a clearer model in which SKIP is a first
class citizen.  It also removes the muddled thinking that the
current model promotes.
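
As a sketch of how little a consumer would need to change, the
following classifies a single result line into three outcomes.  It
accepts the existing forms ("ok" or "not ok" with a "# SKIP"
directive) as well as a hypothetical "skipped" keyword.  The "skipped"
line format and the classify() helper are assumptions made up for
illustration; nothing like them exists in TAP13 or kselftest today.

```python
import re

# Result keyword at the start of the line.  "skipped" is the
# hypothetical third keyword discussed above, not part of TAP13.
_RESULT_RE = re.compile(r"^(not ok|ok|skipped)\b(.*)$")


def classify(line):
    """Return "pass", "fail" or "skip" for a result line, else None."""
    m = _RESULT_RE.match(line.strip())
    if m is None:
        return None  # not a result line (comment, plan, diagnostics)

    keyword, rest = m.groups()
    if keyword == "skipped":
        return "skip"

    # Existing practice: a "# SKIP" directive after the description,
    # matched case-insensitively, on either "ok" or "not ok".
    directive = rest.partition("#")[2].strip()
    if directive.upper().startswith("SKIP"):
        return "skip"

    return "pass" if keyword == "ok" else "fail"
```

With a dedicated keyword the directive check goes away entirely, and
the "ok or not ok?" question for skipped tests no longer needs an
answer.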

> 
> 
> Cheers,
> -- David
>