On Wed, 4 Jan 2012 19:12:57 +0100, Daniel Vetter <daniel at ffwll.ch> wrote: > The "Correct?" was just to check my understanding of your concern, I still > think its invalid. You've cut away the second part of my mail where I > explain why and I don't see you responding to that here. Also > micro-optimizing the gpu reset code sounds a bit strange. Sorry, I didn't explain things very well. Right now, our register access looks like: get(struct_mutex); if (++forcewake_count == 1) force_wake_get() value = read32(reg) or write32(reg, val) if (--forcewake_count == 0) force_wake_put(); /* more register accesses may follow ... */ put(struct_mutex); All very sensible, the whole register sequence is covered by struct_mutex, which ensures that the forcewake is set across the register access. The patch does: get(spin_lock) if (++forcewake_count == 1) force_wake_get() put(spin_lock) value = read32(reg) or write32(reg, val) get(spin_lock) if (--forcewake_count == 0) force_wake_put() put(spin_lock) I realize that all current users hold the struct_mutex across this whole sequence, aside from the reset path, but we're removing that requirement explicitly (the patch removes the WARN_ON calls). Without a lock held across the whole sequence, it's easy to see how a race could occur. We're also dropping and re-acquiring a spinlock with a single instruction between, which seems wasteful. Instead, we should be doing: get(spin_lock) force_wake_get(); value = read32(reg) or write32(reg,val) force_wake_put(); put(spin_lock); No need here to deal with the wake lock at reset time; the whole operation is atomic wrt interrupts. It's more efficient *and* correct, without depending on the old struct_mutex locking. If you want to continue to expose the user-mode wake lock stuff, you could add: get(spin_lock); if (!forcewake_count) force_wake_get(); value = read32(reg) or write32(reg,val) if (!forcewake_count) force_wake_put(); put(spin_lock); This would require that the reset code also deal with the forcewake_count to restore the user-mode force wake. A further optimization would hold the force_wake active for 'a while' to avoid the extra bus cycles required, but we'd want to see a performance problem from this before doing that. -- keith.packard at intel.com -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 827 bytes Desc: not available URL: <http://lists.freedesktop.org/archives/intel-gfx/attachments/20120104/626ccee7/attachment.pgp>