On Tue, 03 Jan 2012 13:49:36 -0800, Ben Widawsky <ben at bwidawsk.net> wrote: > The atomic ops stuff was simply there to help reduce the races (even if > we don't have the lock, we can still safely increment the variable). It > should be safe to get rid of with the spinlock in place. > > My only gripe here is Chris shot down my earlier version of this patch > many moons ago :( The other way of tackling it would be not to take the forcewake during hangcheck at all, and engineer the hangcheck not to rely on the ring reads. For example, use seqno as the primary activity monitor, which only leaves the case of trying not to fire spuriously during a long batchbuffer. To counter that, you could optimistically do a raw read of ACTHD or simply rely on long timeouts. Any error recover should be moved to the error handling workqueue, so that we never attempt to write a register or modify the stuct without the struct_mutex. Reducing the granularity of struct_mutex and solving the contention with mode_config.lock over register access is the ultimate goal when reviewing the locking mess. -Chris -- Chris Wilson, Intel Open Source Technology Centre