Re: [PATCH igt] igt: drop gem_storedw_loop from BAT

Daniel Vetter <daniel@xxxxxxxx> · Thu, 20 Oct 2016 14:50:00 +0200

On Thu, Oct 20, 2016 at 2:28 PM, Mika Kuoppala
<mika.kuoppala@xxxxxxxxxxxxxxx> wrote:
> Daniel Vetter <daniel@xxxxxxxx> writes:
>> On Thu, Oct 20, 2016 at 09:54:33AM +0100, Chris Wilson wrote:
>>> On Thu, Oct 20, 2016 at 11:45:47AM +0300, Petri Latvala wrote:
>>> > On Wed, Oct 19, 2016 at 08:26:17PM +0100, Chris Wilson wrote:
>>> > > The inter-engine synchronisation (with and without semaphores) is
>>> > > equally exercised by gem_sync, so leave gem_storedw_loop out of the
>>> > > "quick" set.
>>> >
>>> >
>>> > How equally is "equally"? Is the test actually redundant, should it be
>>> > removed altogether?
>>>
>>> The stress patterns exhibited by the test are identical to others in
>>> BAT. The accuracy tests are covered by others in BAT. The actual flow
>>> (edge coverage) will be subtly different and therefore the test is still
>>> unique and may catch future bugs not caught by others. But as far as BAT
>>> goes the likelihood of this catching something not caught by others
>>> within BAT is very very small.
>>
>> But given that we have 50k gem tests in full igt, does it really make
>> sense to keep it? Imo there's not much point in keeping around every
>> minute combinatorial variation if it means we can never run the full set
>> of testcases. Some serious trimming of the herd is probably called for.
>>
>> Joonas/Tvrtko/Mika and other gem folks: What's your stance here?
>
> No strong stances. But I really don't see the problem here from
> gem dev point of view. Only the maintenance burden of keeping
> latent/inactive testcases?
>
> Having more than we can possibly run is a positive problem.
> We can pick more berries to basket instead of planting bushes.
>
> I throw this back by asking what is 'full igt'?

Atm we run about 0.1% of igt in CI. I think that's a problem, because
it means lots of testcases get written, but not used.

The other issue is that by not running testcases on a big set of
machines we don't catch the small bugs and races in the tests themselves
(and there's plenty of those too). Which means they are of lower
quality, and hence also not that useful for QA (as in quality
reporting). Which further reduces the value we can gain from having
these testcases.

The last one is a chicken-egg problem: Because no one runs this stuff,
it doesn't get better, and that's the reason it doesn't get run in CI,
which means no one runs it. Imo the only way to get out of that cycle
is by applying a _lot_ of survival pressure on testcases, to weed out
the stuff which is not providing a benefit. So yes I do believe we
have too many testcases, and it's costing us, because it's essentially
dead code.

So phrased another way: Why do we want to keep dead code in igt, but
are happy to rip out disabled-by-default features in the kernel? Why
is igt special and it's ok to treat it as a dumping ground and not
clean out deadwood?

I know that the shoddy shape of CI is part of the problem here, but
like I said above, igt is imo also part of the problem, and there's a
reinforcing cycle.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx