On Mon, Feb 04, 2013 at 04:04:36PM +0200, Mika Kuoppala wrote: > Hi, > > This patchset adds ioctl and related changes to allow > userspace to query about the context loss status of the > specified context. The aim is to provide enabler > for the ARB_Robustness GL extension. > > I post this as RFC to get ack/nack on the interface. > The interface is in 0004-drm-i915-add-i915_get_reset_status_ioctl.patch > > Also any feedback on the other parts of the patchset is most > welcomed. > > The i-g-t testcase can be found here: > https://github.com/mkuoppal/intel-gpu-tools/commits/arb-robustness Ok, I've only quickly scrolled through this but it looks rather sane, and the interface seems to be a fit for arb robustness. So tentative ack on the ioctl ;-) A few comments: The first patch imo makes sense conceptually, even though it's not required to implement the ioctl (only the test). This needs critical review from Chris though since he's really picky about the hangcheck misfiring or not detecting hangs. Might be that we also have some races left which have been papered over by the acthd checks ... Best to split this out for maximum scrunity. For the remaining stuff, I see three pieces: - refcounts for hw contexts: Imo this is rather independent and should go in early. If it doesn't apply without the other I think we should move it ahead in a rebase. - Infrastructure to figure out whom to blame for a hang. One thing I'd really love to see on top of your patches is adjusting the "hanging too fast" detection to only disable command execution for the offending fd and not kill the entire gpu unnecessarily. This should help a lot already today, without any userspace support. I think it'd be also good to have an i-g-t test for this which tries to hang the gpu with the hangman on e.g. the blit constantly until the gpu dies. And then open a new fd to check whether things still work. - Putting that blame to the right hw contexts and exposing it through the ioctl. Usual interface rules applies: We need to have userspace support more or less ready before merging, i.e. a rough cut at the robustness extension patches for mesa. Probably best if you can do this in collaboration with one of the gl guys in Helsinki. Cheers, Daniel > > Thanks, > --Mika > > Mika Kuoppala (7): > drm/i915: detect infinite loops in hang check > drm/i915: add struct i915_reset_stats > drm/i915: add reset_status for hw_contexts > drm/i915: add i915_get_reset_status_ioctl > drm/i915: add batch object and context to i915_add_request() > drm/i915: reference count for i915_hw_contexts > drm/i915: find guilty batch buffer on ring resets > > drivers/gpu/drm/i915/i915_dma.c | 3 +- > drivers/gpu/drm/i915/i915_drv.c | 42 +++++++++ > drivers/gpu/drm/i915/i915_drv.h | 41 +++++++-- > drivers/gpu/drm/i915/i915_gem.c | 119 ++++++++++++++++++++++++-- > drivers/gpu/drm/i915/i915_gem_context.c | 66 +++++++++++++-- > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 11 ++- > drivers/gpu/drm/i915/i915_irq.c | 127 ++++++++++++++-------------- > drivers/gpu/drm/i915/intel_overlay.c | 5 +- > drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +- > drivers/gpu/drm/i915/intel_ringbuffer.h | 4 + > include/uapi/drm/i915_drm.h | 19 +++++ > 11 files changed, 351 insertions(+), 88 deletions(-) > > -- > 1.7.9.5 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx at lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch