On Wed, Apr 25, 2012 at 08:57, Daniel Vetter <daniel.vetter at ffwll.ch> wrote: > gpu reset is a very important piece of our infrastructure. > Unfortunately we only really it test by actually hanging the gpu, > which often has bad side-effects for the entire system. And the gpu > hang handling code is one of the rather complicated pieces of code we > have, consisting of > - hang detection > - error capture > - actual gpu reset > - reset of all the gem bookkeeping > - reinitialition of the entire gpu > > This patch adds a debugfs to selectively stopping rings by ceasing to > update the hw tail pointer, which will result in the gpu no longer > updating it's head pointer and eventually to the hangcheck firing. > This way we can exercise the gpu hang code under controlled conditions > without a dying gpu taking down the entire systems. > > Patch motivated by me forgetting to properly reinitialize ppgtt after > a gpu reset. > > Usage: > > echo $((1 << $ringnum)) > i915_ring_stop # stops one ring > > echo 0xffffffff > i915_ring_stop # stops all, future-proof version > > then run whatever testload is desired. i915_ring_stop automatically > resets after a gpu hang is detected to avoid hanging the gpu to fast > and declaring it wedged. > > v2: Incorporate feedback from Chris Wilson. > > v3: Add the missing cleanup. > > Signed-Off-by: Daniel Vetter <daniel.vetter at ffwll.ch> > Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk> > I think I sent my R-b for the previous series, but just in case, for the new one: Reviewed-By: Eugeni Dodonov <eugeni.dodonov at intel.com> With small bikeshed below: static ssize_t > +i915_ring_stop_read(struct file *filp, > + char __user *ubuf, > + size_t max, > + loff_t *ppos) > +{ > + struct drm_device *dev = filp->private_data; > + drm_i915_private_t *dev_priv = dev->dev_private; > + char buf[80]; > buf is 80 characters here, but > +static ssize_t > +i915_ring_stop_write(struct file *filp, > + const char __user *ubuf, > + size_t cnt, > + loff_t *ppos) > +{ > + struct drm_device *dev = filp->private_data; > + struct drm_i915_private *dev_priv = dev->dev_private; > + char buf[20]; > here it is 20.. I don't think we'll need more than 20 the way it is supposed to work, so maybe standardize it to 80 above as well for consistency? -- Eugeni Dodonov <http://eugeni.dodonov.net/> -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.freedesktop.org/archives/intel-gfx/attachments/20120425/8097f714/attachment.htm>