On Sat, Mar 15, 2014 at 4:46 PM, Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote: > On Sat, Mar 15, 2014 at 12:08:55AM +0100, Daniel Vetter wrote: >> There's an entire pile of issues in here: >> >> - Use the main RING_HEAD register, not ACTHD. ACTHD points at the gtt >> offset of the batch buffer when a batch is executed. Semaphores are >> always emitted to the main ring, so we always want to look at that. > > True, nice catch that would explain a hard hang quite neatly if we tried > to read from beyond the aperture. > >> - Mask the obtained HEAD pointer with the actual ring size, which is >> much smaller. Together with the above issue this resulted us in >> trying to dereference a pointer way outside of the ring mmio >> mapping. The resulting invalid access in interrupt context >> (hangcheck is executed from timers) lead to a full blown kernel >> panic. The fbcon panic handler then tried to frob our driver harder, >> resulting in a full machine hang at least on my snb here where I've >> stumbled over this. >> >> - Handle ring wrapping correctly and be a bit more explicit about how >> many dwords we're scanning. We probably should also scan more than >> just 4 ... > > You can ignore ring wrapping as valid commands cannot wrap, hence the > formulation of acthd_min. Well I've thought that HEAD usually points at the next command. I'm not sure about the exact semantics at wrap-around time but since implementing it was trivially I've figured it's better to be paranoid. > /* > * Be paranoid and presume the hw has gone off into the wild - > * our ring is smaller than what the hardware (and hence > * HEAD_ADDR) allows. > */ > head = I915_READ_HEAD(ring) & (ring->size - 1); > head_min = max((int)head - 3 * 4, 0); > while (head >= head_min) { > cmd = ioread32(ring->virtual_start + head); > if (cmd == ipehr) > break; > head -= 4; > } Yeah, this would also fix the immediate bug at hand. But my general impression of this code is that it's trusting reconstructed state way too much. Hence why I've gone for the more wordy version of firsting oring out the right field from the register and then for restricting with the size of the ring. I'll do a follow-up patch to give the the same treatment to the semaphore signaller computation. >> + /* >> + * Be paranoid and presume the hw has gone off into the wild - >> + * our ring is smaller than what the hardware (and hence >> + * HEAD_ADDR) allows. Also handles wrap-around. >> + */ >> + head &= ring->size - 1; >> + >> + /* This here seems to blow up */ >> + cmd = ioread32(ring->virtual_start + head); > So what does this mean? Oops, the comment was just a note from my oops decoding. The ioread32 is where we've blown. The comment should be removed since it's not useful outside of my debug session. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx