Re: [RFC i-g-t] igt: Test tagging support

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Thu, 29 Jun 2017 10:23:03 +0100

On 29/06/2017 09:49, Daniel Vetter wrote:
On Tue, Jun 27, 2017 at 03:03:08PM +0100, Tvrtko Ursulin wrote:

On 27/06/2017 14:18, Daniel Vetter wrote:
On Tue, Jun 27, 2017 at 12:59:33PM +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

Proposal to add test tags as a replacement for separate test
list which can be difficult to maintain and get out of date.

Putting this maintanenace inline with tests makes it easier
to remember to update the, now implicit, lists, and also enables
richer test selection possibilities for the test runner.

Test tags are string tokens separated by spaces meaning of which
we decide by creating a convetion and helped by the suitable
helper macros.

For example tags can be things like: gem, kms, guc, flip,
semaphore, bz12345678, gt4e, etc..

At runtime this would look something like this:

    root@e31:~/intel-gpu-tools# tests/gem_sync --list-subtests-with-tags
    default TAGS="gem "
    store-default TAGS="gem "
    many-default TAGS="gem "
    forked-default TAGS="gem "
    forked-store-default TAGS="gem "
    render TAGS="gem "
    store-render TAGS="gem "
    many-render TAGS="gem "
    forked-render TAGS="gem "
    forked-store-render TAGS="gem "
    bsd TAGS="gem "
    store-bsd TAGS="gem "
    many-bsd TAGS="gem "
    forked-bsd TAGS="gem "
    forked-store-bsd TAGS="gem "
    bsd1 TAGS="gem "
    store-bsd1 TAGS="gem "
    many-bsd1 TAGS="gem "
    forked-bsd1 TAGS="gem "
    forked-store-bsd1 TAGS="gem "
    bsd2 TAGS="gem "
    store-bsd2 TAGS="gem "
    many-bsd2 TAGS="gem "
    forked-bsd2 TAGS="gem "
    forked-store-bsd2 TAGS="gem "
    blt TAGS="gem "
    store-blt TAGS="gem "
    many-blt TAGS="gem "
    forked-blt TAGS="gem "
    forked-store-blt TAGS="gem "
    vebox TAGS="gem "
    store-vebox TAGS="gem "
    many-vebox TAGS="gem "
    forked-vebox TAGS="gem "
    forked-store-vebox TAGS="gem "
    each TAGS="gem basic"
    store-each TAGS="gem basic"
    many-each TAGS="gem basic"
    forked-each TAGS="gem basic"
    forked-store-each TAGS="gem "
    all TAGS="gem basic"
    store-all TAGS="gem basic"
    forked-all TAGS="gem extended"
    forked-store-all TAGS="gem extended"

Test runner can then enable usages like:

    ./run-tests --include gem --exclude kms,stress

Which I think could really be useful and more flexible than name
based selection.

How is this different from name-based pattern matching? We could just
throw these tags into the names, which is pretty much what we do already.

Tags in names are not as flexible and clear. When do tags begin and names
start, how ugly it will look with names getting long etc.

It might be ugly, but if we end up with 2 incomplete/ugly ways to tag
tests, then we're worse of. You're RFC looks like that.

Damn it, the RFC only looks like that because it only converted a couple 
of tests cases and not all of them. Not because I am proposing to have 
two parallel systems.

This RFC implementation automatically adds "gem" and "kms" tags
to test binaries prefixed with those strings respectively.

Why do you want to filter for gem or kms only? If you want to locally test
for a feature, I'd expect you want a more focused test selection, using
naming patterns/exclusions. Or maybe just one test binary that you run.

Yep, and so I said early in the commit message. Here I forgot to explain
that this gem/kms tag adding I've put in the RFC is just for tests who do
not specify their own tags.

I've mentioned tags like gem, kms, guc, flip, semaphore, bz12345678, gt4e,
etc..

For CI this doesn't help us, since it doesn't give us a way to both
a) make sure new tests are by default in the extended/full/CI/whatever you
want to call it run
b) exclude the massive pile of stress tests we currently have and can't
run for logistical reasons.

I think it gives us both. You simply tag a new test with either
basic/extended/stress and it is automatically included in the correct run.

Adding some tags for stress tests, or nasty debug-hunting mode is what I
think we need, so that we can exclude them from default runs. Currently we
have two examples of those:

- kms_frontbuffer_tracking --show-hidden
- gem_concurrent_blt vs _all

Your patch doesn't address this, and since we have 2 independent

It does my introducing a stress keyword and it even marks gem_concurrent_
with it.

What about kms_frontbuffer_tracking? We already have 2 ways to mark stress
tests, now you add a third. And yours has the problem that testcases are

Is the problem that I did not convert kms_frontbuffer_tracking in the 
sketch of a tagging system or something else?

still run, even if tagged as "stress".

?? "run-tests --include basic,extended --exclude stress" is an example 
of the proposal for the new system. So I don't get this criticism. 
Tagging tests with "stress" exactly enables you not to run them.

inventions of some solution for this, it does seem to be a real problem. I
think tags would be a good way to fix this, but tags just to have more
structured names seems like lots of work for little gain (and we're not
really short on work to do).

Yes it is some work, but imho not so much.

Start with simply converting the current CI set which is not a long list.
Then have an intern convert the extended list to tags. :) At this point we
all work together to extend the extended list and tune the tests at the same
time. This work is needed anyway if we want to increase coverage and fix
tests, which we do want. So while we are doing that we can just tag them
appropriately.

Not sure we need an intern, how exactly is an intern going to make good
judgement calls when not even most developers know what all the tests do?
I certainly don't ...

Intern was simply a response to the criticism that converting the 
existing testslists to tags is a big job. It is a mindless job for a 
first phase which brings feature parity to where we are today. All tests 
which are mentioned in the extended test list simply needed to be 
converted to igt_gem|kms_subtest("extended", "test-name").

Adding additional fine-tuned tags is an additional job on top which is 
something we don't have today anyway. So that one is a job for developers.

I think this would work much better as an improvement if we reduce scope.
I think managing BAT as an explicit list is doable, and I don't think we
need to add/remove tests all that often (BAT is supposed to check whether
we can run validation runs, not validate itself).

The current accute problem we have is that we can filter out stress tests
and make them not run by default, without ending up with an unmanageable
whitelist. I think just introducing the stress tag (as a #define, not as a
random string, and with documentation) would be an actual real
improvement. At least if it's used to unify the other two ways we have to
auto-delist stress tests (without that it just makes things worse).

For general tagging we atm have the naming scheme, plus the feature
profile json (it's starting to get used I hope soonish). So if you want to
switch that over to tags we'd also need to figure out how to
integrate/replace the feature json stuff.

Just adding tags because tags sounds like a cool idea on top of all the
other things we do already still doesn't sound appealing to me.

It is not supposed to be on top, but to unify and replace.

But OK, I've tried to sell it and no one is liking is so fine. The only 
thing that bothers me that to me it sounds like criticism is missing the 
point, but I don't know how else to try and explain the system.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx