On Mon, Jul 13, 2015 at 10:09:59AM +0000, Gore, Tim wrote: > > > Tim Gore > Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ > > > > -----Original Message----- > > From: Daniel Vetter [mailto:daniel.vetter@xxxxxxxx] On Behalf Of Daniel > > Vetter > > Sent: Monday, July 13, 2015 10:35 AM > > To: Gore, Tim > > Cc: intel-gfx@xxxxxxxxxxxxxxxxxxxxx; Wood, Thomas; Kuoppala, Mika > > Subject: Re: [PATCH i-g-t] tests/gem_reset_stats.c: fix "ban" tests > > with scheduler > > > > On Fri, Jul 10, 2015 at 02:26:59PM +0100, tim.gore@xxxxxxxxx wrote: > > > From: Tim Gore <tim.gore@xxxxxxxxx> > > > > > > The tests for context banning fail when the gpu scheduler is enabled. > > > The test causes a hang (using an infinite loop > > > batch) and then queues up some work behind it on both the hanging > > > context and also on a second "good" context. On the "good" context it > > > queues up 2 batch buffers. After the hanging ring has been reset (not > > > a full gpu reset) the test checks the values of batch_active and > > > batch_pending returned by the i915_get_reset_stats_ioctl. For the > > > "good" > > > context it expects to see batch_pending == 2, because two batch > > > buffers we queued up behind the hang on this context. But, with the > > > scheduler enabled (android, gen8), one of these batch buffers is still > > > waiting in the scheduler and has not made it as far as the > > > ring->request_list, so this batch buffer is unaffected by > > > the ring reset, and batch_pending is only 1. > > > > > > I considered putting in a test for the scheduler being enabled, but > > > decided that a simpler solution is to only queue up 1 batch buffer on > > > the good context. This does not change the test logic in any way and > > > ensures that we should always have batch_pending=1, with or without > > > the scheduler. > > > > For the scheduler/tdr we probably need to split this up into two testcases > > each: > > - one where the 2nd batch depends upon the first (cross-context depency). > > - one where the 2nd batch doesn't depend upon the first (should execute > > unhampered with scheduler/tdr). > > > > Since we don't yet have a scheduler/tdr both of these will result in the same > > outcome (since there's always the temporal depency). But with your patch > > you only test the 2nd case (and incompletely, we should assert that the 2nd > > batch did run) and ignore the first case. > > > > In short this failure here is telling you that your test coverage for these > > features is lacking. The correct fix is not to mangle the existing, but fix it up to > > correctly cover the new expectations in all cases. And that should be done as > > part of the tdr/scheduler submission. > > -Daniel > > > Should gem_rest_stats be testing the operation of the scheduler? I would have > thought that the scheduler operation should have its own test(s). Gem_reset_stats > is just about the reset mechanism and the stats collected. I can add this test, just > seems like the wrong place. Well gem_reset_stats assumes that a hang will victimize all subsequent batches, irrespective of context. tdr/scheduler change that rather fundamentally (which is the entire point really, at least of tdr). So yeah we need to adjust those existing testcase. And I think it's clearer if we do that by changing the existing test-cases, that way the impact on existing features for new code is much clearer. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx