On Wed, Jan 04, 2012 at 06:22:41PM -0800, Keith Packard wrote: > On Wed, 4 Jan 2012 19:12:57 +0100, Daniel Vetter <daniel at ffwll.ch> wrote: > > > The "Correct?" was just to check my understanding of your concern, I still > > think its invalid. You've cut away the second part of my mail where I > > explain why and I don't see you responding to that here. Also > > micro-optimizing the gpu reset code sounds a bit strange. > > Sorry, I didn't explain things very well. > > Right now, our register access looks like: > > get(struct_mutex); > if (++forcewake_count == 1) > force_wake_get() > > value = read32(reg) or write32(reg, val) > > if (--forcewake_count == 0) > force_wake_put(); > > /* more register accesses may follow ... */ > put(struct_mutex); > > All very sensible, the whole register sequence is covered by > struct_mutex, which ensures that the forcewake is set across the > register access. > > The patch does: > > get(spin_lock) > if (++forcewake_count == 1) > force_wake_get() > put(spin_lock) > value = read32(reg) or write32(reg, val) > get(spin_lock) > if (--forcewake_count == 0) > force_wake_put() > put(spin_lock) > > I realize that all current users hold the struct_mutex across this whole > sequence, aside from the reset path, but we're removing that requirement > explicitly (the patch removes the WARN_ON calls). > > Without a lock held across the whole sequence, it's easy to see how a > race could occur. We're also dropping and re-acquiring a spinlock with a > single instruction between, which seems wasteful. Instead, we should be > doing: > > get(spin_lock) > force_wake_get(); > value = read32(reg) or write32(reg,val) > force_wake_put(); > put(spin_lock); > > No need here to deal with the wake lock at reset time; the whole > operation is atomic wrt interrupts. It's more efficient *and* correct, > without depending on the old struct_mutex locking. > > If you want to continue to expose the user-mode wake lock stuff, you > could add: > > get(spin_lock); > if (!forcewake_count) > force_wake_get(); > value = read32(reg) or write32(reg,val) > if (!forcewake_count) > force_wake_put(); > put(spin_lock); > > This would require that the reset code also deal with the > forcewake_count to restore the user-mode force wake. > > A further optimization would hold the force_wake active for 'a while' to > avoid the extra bus cycles required, but we'd want to see a performance > problem from this before doing that. I think you've lost me ... A few random comments first, it looks like I neeed more coffee: - The reset code (running from a workqueue) does hold sturct mutex. It's the hangcheck and error state capture code running from softirq/timer context causing issues. - Forcewake works like a reference counted resources. As long as all _get/_put calls are properly balanced, I don't see how somebody could sneak in in between (when we don't hold the spinlock) and cause havoc. For paranaoia we might want to drop a WARN_ON in the _put call to check whether it ever drops below 0. But all current users are clearly balanced, so I didn't bother with it. - My missed IRQ w/a actually relies on this by grabbing a forcewake ref in get_irq and dropping it again in put_irq. In between there's usually a schedule(). - I've pondered with Chris whether we should do your proposed optimization but we then noticed that the gem code doesn't actually read from any forcewake protected registers in normal execution (I don't consider running the hangcheck timer normal ;-). With my missed irq w/a that now changes, so we might need to reconsider this. But imo that's material for a separate patch. Yours, Daniel -- Daniel Vetter Mail: daniel at ffwll.ch Mobile: +41 (0)79 365 57 48