On Fri, Feb 13, 2015 at 02:12:48PM +0000, Chris Wilson wrote:
> On Fri, Feb 13, 2015 at 02:43:40PM +0100, Daniel Vetter wrote:
> > On Fri, Feb 13, 2015 at 12:59:45PM +0000, Chris Wilson wrote:
> > > Long ago I found that I was getting sporadic errors when booting SNB,
> > > with the symptom being that the first batch died with IPEHR != *ACTHD,
> > > typically caused by the TLB being invalid. These magically disappeared
> > > if I held the forcewake during the entire ring initialisation sequence.
> > > (It can probably be shortened to a short critical section, but the whole
> > > initialisation is full of register writes and so we would be taking and
> > > releasing forcewake almost continually, and so holding it over the
> > > entire sequence will probably be a net win!)
> > >
> > > Note some of the kernels on which I encountered the issue already had
> > > the deferred forcewake release, so it is still relevant.
> > >
> > > I know that there have been a few other reports with similar failure
> > > conditions on SNB, I think such as
> > > References: https://bugs.freedesktop.org/show_bug.cgi?id=80913
> > >
> > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> >
> > Given that we've already added a forcewake critical section around
> > individual ring inits this makes maybe a bit too much sense. But I do
> > wonder whether we don't need the same for resume and gpu resets?
> >
> > With the split into hw/sw setup we could get that by pushing the
> > forcewake_get/put into i915_gem_init_hw. Does the magic still work with
> > that? And if we put it there, the fw_get/put in init_ring_common is fully
> > redundant and could be removed.
>
> Hmm, my original thought was to keep the engine alive from the first
> programming of CTL up until we fed in the first request (which is the
> ppgtt/context init). We can add a second forcewake layer into init_hw to
> give the same security blanket for resume/reset. Sound reasonable?

With the split into sw/hw setup, init_hw should be all that's needed for
coverage; nothing touches the hw outside of it. Hence I think the original
outer layer is redundant. Does the magic still work if we drop that part,
or have I missed some hw access (there really shouldn't be any)?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
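
For readers following the nesting argument, here is a rough userspace
sketch of why one forcewake reference held across the whole hw init makes
the per-ring reference redundant. It is only a model under stated
assumptions: struct fake_dev, fw_get(), fw_put(), write_reg(), init_ring()
and init_hw_sketch() are hypothetical names invented for illustration, not
the i915 functions, and the "hardware" is just a counter. The only thing
assumed about the real driver is what the thread says: forcewake is
reference counted, and register writes take and release it.

```c
#include <assert.h>

/* Hypothetical stand-in for the device; NOT the real i915 structures. */
struct fake_dev {
	int fw_count;     /* nested forcewake references currently held */
	int wake_cycles;  /* how often the count went from 0 to 1 */
	int reg_writes;   /* simulated register writes */
};

/* Take a reference; only the first holder actually wakes the hardware. */
static void fw_get(struct fake_dev *d)
{
	if (d->fw_count++ == 0)
		d->wake_cycles++;
}

/* Drop a reference; the hardware may sleep again once the count is 0. */
static void fw_put(struct fake_dev *d)
{
	assert(d->fw_count > 0);
	d->fw_count--;
}

/* Each register write brackets itself with its own get/put. */
static void write_reg(struct fake_dev *d)
{
	fw_get(d);
	d->reg_writes++;
	fw_put(d);
}

/* Per-ring critical section, as in the existing ring init. */
static void init_ring(struct fake_dev *d)
{
	fw_get(d);
	write_reg(d);
	write_reg(d);
	write_reg(d);
	fw_put(d);
}

/*
 * Whole hw init. With hold_outer set, a single reference spans all
 * rings, so the inner get/put pairs never hit the 0 -> 1 transition.
 */
static void init_hw_sketch(struct fake_dev *d, int hold_outer)
{
	int ring;

	if (hold_outer)
		fw_get(d);
	for (ring = 0; ring < 3; ring++)
		init_ring(d);
	if (hold_outer)
		fw_put(d);
}
```

In this model, calling init_hw_sketch() with hold_outer set wakes the
device once for all three rings, while calling it without wakes it once
per ring. Whether the real hardware also needs the reference held until
the first request is fed in is exactly the open question above; the sketch
does not answer it.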