On Thu, Jan 20, 2011 at 2:25 AM, Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote: > > Right, the autoreported HEAD may have been already reset to 0 and so hit > the wraparound bug which caused it to exit early without actually > quiescing the ringbuffer. Yeah, that would explain the issue. > Another possibility is that I added a 3s timeout waiting for a request if > IRQs were suspended: No, if IRQ's are actually suspended here, then that codepath is totally buggy and would blow up (msleep() doesn't work, and jiffies wouldn't advance on UP). So that's not it. > Both of those I think are symptoms of another problem, that perhaps during > suspend we are shutting down parts of the chip before idling? That could be, but looking at the code, one thing strikes me: the _normal_ case (of just waiting for "enough space" in the ring buffer) doesn't need to use the exact case, but the "wait for ring buffer to be totally empty" does. Which means that the use of the "fast-but-inaccurate" 'head' sounds wrong for the "wait for idle" case. So can you explain the difference between intel_read_status_page(ring, 4); vs I915_READ_HEAD(ring); because from looking at the code, I get the notion that "intel_read_status_page()" may not be exact. But what happens if that inexact value matches our cached ring->actual_head, so we never even try to read the exact case? Does it _stay_ inexact for arbitrarily long times? If so, we might wait for the ring to empty forever (well, until the timeout - the behavior I see), even though the ring really _is_ empty. No? Also, isn't that "head < ring->actual_head" buggy? What about the overflow case? Not that we care, because afaik, 'actual_head' is not actually used anywhere, so it should be called 'pointless_head'? That code looks suspiciously bogus. Linus _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel