On Wed, Jan 18, 2012 at 09:51:30AM -0800, Eric Anholt wrote:
> On Wed, 18 Jan 2012 11:17:52 +0000, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> > On Wed, 18 Jan 2012 01:24:26 +0100, Daniel Vetter <daniel at ffwll.ch> wrote:
> > > On Wed, Jan 18, 2012 at 01:16:02AM +0100, CC wrote:
> > > > I attached the error state.
> > >
> > > Nice one, your gpu seems to have simply disappeared. And the ringbuffer
> > > contains a rather peculiar cmd sequence. Putting Chris (maybe he
> > > recognizes the pattern) and Ben (he's got a patch in the works to dump a
> > > debug register that might be interesting here) on cc. It's too late atm
> > > for me to think about this some more.
> >
> > Not simply disappeared, someone clobbered it with an extremely large
> > hammer. The GPU was killed by a stray write to address 0 which took out
> > the render ring buffer and its hws page. So my first thought is a
> > missing relocation, and i965g springs to mind.
> > -Chris
>
> At one point there was a bug in Mesa that wrote to 0:
>
> commit dfada714f8db3deea2fea3583c3c166a78db1117
> Author: Eric Anholt <eric at anholt.net>
> Date:   Fri Jun 17 18:20:36 2011 -0700
>
>     i965/gen6: Use an BO instead of writing to address 0 for PIPE_CONTROL W/A.
>
>     This was spectacularly unsafe. On my system, address 0 happens to be
>     the hardware status page for the render ring, and the first quadword
>     of that happens to contain nothing we ever look at, but I sure didn't
>     look forward to having to debug some day when, for example, the kernel
>     happened to bind the ringbuffer before binding the hwsp.

Unfortunately the error_state contains more garbage than just one stray 0
write. So yeah, if this is due to the i965g gallium driver, that would
explain things - otherwise I'm hoping for Ben's reworked gt fifo patch. The
CS regs are all 0, indicating that the gpu isn't getting out of deep sleep
anymore.
-Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48