+ Petri

On Thu, 2017-10-19 at 16:29 +0300, Martin Peres wrote:
> On 19/10/17 12:51, Daniel Vetter wrote:
> > CI gets upset about it resulting in an incomplete, let's skip it until
> > that's fixed to avoid havoc in the CI farm. Of course this should/will
> > be reverted as soon as we have a fix (similar to how we dealt with the
> > snb-dies-in-blt-hangs issue).
> >
> > Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
> > Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: "Lofstedt, Marta" <marta.lofstedt@xxxxxxxxx>
> > Cc: Martin Peres <martin.peres@xxxxxxxxxxxxxxx>
> > References: https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_eio@xxxxxxxxxxxxxxxxxxxxxx
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=103289
> > Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxx>

<SNIP>

> So, let's recap the problem here:
> - Any incomplete in sharded runs mean that the platform is unfit for
>   pre-merge (because any other test after will go from pass to notrun)
> - We can't fix issues immediately, especially for old platforms
>
> This patch is sweeping the test under the rug by using the skip output,
> which is not only hard to track, it is also misleading.
>
> After discussing with Marta, Arek and Petri, we found some consensus on
> the following proposal (terminology is up for debate):
>
> - Introduce igt_dodge_on(cond, label): Report a pre-emptive 'fail' when
>   the condition is true. Make sure this is over-ridable with IGT_DODGE=0
>   so as we can easily run these tests without recompiling them.

Make this igt_skip_on_ci(cond) and require IGT_CI=1 to activate them.
Much like with simulation.

Still, a BIOS update to one of the CI machines might mean (if it's not
the case now, it's not very far-fetched for the future) that we go and
churn the IGT codebase to drop a bunch of these.

That's not the best workflow I can think of, when we're discussing a
separate mailing list for IGT discussion and patches to make it more
self-contained.
Would we then bind that new mailing list to the contents of our CI farm,
and tie fixes to the CI farm operation directly to the IGT reviewing
bandwidth?

I still think the best way would be for CI to mask the known problematic
tests from the pass/fail criteria, and then have somebody actually look
at the masked ones once their testing coverage has been prioritized.

I think IGT should try to provide a wide range of tests that are
supposed to work on any given hardware. If they don't, that's not a
reason to change the tests themselves.

With the filter, we can grow the testing coverage for the new platforms,
even if CI happens to have odd machines that may not pass those tests
(and we may not have the resources to immediately fix those). All this
without churning the IGT codebase.

But if this is the only technically viable solution in the short term,
then so be it. I just see better options too.

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
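
For illustration only, the CI-side masking suggested above could look
roughly like the sketch below. Everything in it is an assumption made up
for the example: the function name, the mask being a plain set of test
names, and the placeholder test names. The result strings are just the
ones mentioned in this thread. It is not how the CI farm is implemented.

```python
# Hypothetical sketch of a CI-side mask: known-problematic tests are left
# out of the pass/fail criteria but are still reported, so that somebody
# can look at them later. Nothing here is existing CI or IGT code.

# Results that count against a run (assumed; taken from this thread).
BAD_RESULTS = {"fail", "incomplete"}


def evaluate_run(results, masked):
    """Split bad results into blocking and masked ones.

    results: dict mapping test name -> result string
    masked:  set of test names known to be problematic on this machine

    Returns (ok, blocking, hidden), where ok is True only when no
    unmasked test has a bad result.
    """
    bad = [name for name, res in results.items() if res in BAD_RESULTS]
    blocking = sorted(name for name in bad if name not in masked)
    hidden = sorted(name for name in bad if name in masked)
    return (len(blocking) == 0, blocking, hidden)


if __name__ == "__main__":
    # Placeholder test names, not real IGT tests.
    results = {"test-a": "pass", "test-b": "incomplete", "test-c": "skip"}
    ok, blocking, hidden = evaluate_run(results, masked={"test-b"})
    print("verdict:", "ok" if ok else "not ok")
    print("blocking:", blocking)
    print("masked but failing:", hidden)
```

The point of keeping the mask on the CI side is that dropping an entry
(say, after a BIOS update fixes a machine) needs no IGT patch, and the
masked failures stay visible instead of showing up as a skip.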