Quoting Tvrtko Ursulin (2019-04-12 13:59:30) > > On 12/04/2019 09:53, Chris Wilson wrote: > > For controlling runtime pm of the GT and engines, we would like to have > > a callback to do extra work the first time we wake up and the last time > > we drop the wakeref. This first/last access needs serialisation and so > > we encompass a mutex with the regular intel_wakeref_t tracker. > > > > v2: Drop the _once naming and report the errors. > > > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > Cc; Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > > --- > > drivers/gpu/drm/i915/Makefile | 1 + > > drivers/gpu/drm/i915/Makefile.header-test | 3 +- > > drivers/gpu/drm/i915/i915_drv.h | 3 +- > > drivers/gpu/drm/i915/intel_wakeref.c | 62 ++++++++++ > > drivers/gpu/drm/i915/intel_wakeref.h | 133 ++++++++++++++++++++++ > > 5 files changed, 199 insertions(+), 3 deletions(-) > > create mode 100644 drivers/gpu/drm/i915/intel_wakeref.c > > create mode 100644 drivers/gpu/drm/i915/intel_wakeref.h > > > > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile > > index 40130cf5c003..233bad5e361f 100644 > > --- a/drivers/gpu/drm/i915/Makefile > > +++ b/drivers/gpu/drm/i915/Makefile > > @@ -50,6 +50,7 @@ i915-y += i915_drv.o \ > > intel_device_info.o \ > > intel_pm.o \ > > intel_runtime_pm.o \ > > + intel_wakeref.o \ > > intel_uncore.o > > > > # core library code > > diff --git a/drivers/gpu/drm/i915/Makefile.header-test b/drivers/gpu/drm/i915/Makefile.header-test > > index 96a5d90629ec..e6b3e7588860 100644 > > --- a/drivers/gpu/drm/i915/Makefile.header-test > > +++ b/drivers/gpu/drm/i915/Makefile.header-test > > @@ -31,7 +31,8 @@ header_test := \ > > intel_psr.h \ > > intel_sdvo.h \ > > intel_sprite.h \ > > - intel_tv.h > > + intel_tv.h \ > > + intel_wakeref.h > > > > quiet_cmd_header_test = HDRTEST $@ > > cmd_header_test = echo "\#include \"$(<F)\"" > $@ > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > > index fad5306f07da..62a7e91acd7f 100644 > > --- a/drivers/gpu/drm/i915/i915_drv.h > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > @@ -74,6 +74,7 @@ > > #include "intel_opregion.h" > > #include "intel_uc.h" > > #include "intel_uncore.h" > > +#include "intel_wakeref.h" > > #include "intel_wopcm.h" > > > > #include "i915_gem.h" > > @@ -134,8 +135,6 @@ bool i915_error_injected(void); > > __i915_printk(i915, i915_error_injected() ? KERN_DEBUG : KERN_ERR, \ > > fmt, ##__VA_ARGS__) > > > > -typedef depot_stack_handle_t intel_wakeref_t; > > - > > enum hpd_pin { > > HPD_NONE = 0, > > HPD_TV = HPD_NONE, /* TV is known to be unreliable */ > > diff --git a/drivers/gpu/drm/i915/intel_wakeref.c b/drivers/gpu/drm/i915/intel_wakeref.c > > new file mode 100644 > > index 000000000000..f4cfaa154303 > > --- /dev/null > > +++ b/drivers/gpu/drm/i915/intel_wakeref.c > > @@ -0,0 +1,62 @@ > > +/* > > + * SPDX-License-Identifier: MIT > > + * > > + * Copyright © 2019 Intel Corporation > > + */ > > + > > +#include "intel_drv.h" > > +#include "intel_wakeref.h" > > + > > +int __intel_wakeref_get_first(struct drm_i915_private *i915, > > + struct intel_wakeref *wf, > > + int (*fn)(struct intel_wakeref *wf)) > > +{ > > + /* > > + * Treat get/put as different subclasses, as we may need to run > > + * the put callback from under the shrinker and do not want to > > + * cross-contanimate that callback with any extra work performed > > + * upon acquiring the wakeref. > > + */ > > So you want to seralize the get/put callbacks but don't want lockdep to > see any chains resulting from that. Sounds worrying. > > What if you moved them (callbacks) outside wf->mutex then and mandate > the callback pairs to serialize themselves? Doesn't work for the purpose -- we do want the global transition of first/last use serialised with all other users, not the callbacks themselves, and that requires the mutex across the entire 0<->1 transition. For example, we use the knowledge that there is a global/engine mutex held to insert a GPU context switch on parking, outside of a context where we would be allow to take any of the mutexes required for doing so. It's exactly the same problem we have with context pinning and elsewhere, intel_context_pin() is expensive and cannot allow allocations; intel_context_unpin() is cheap and may be called from the shrinker. > > + mutex_lock_nested(&wf->mutex, SINGLE_DEPTH_NESTING); > > + if (!atomic_read(&wf->count)) { > > + int err; > > + > > + wf->wakeref = intel_runtime_pm_get(i915); > > + > > + err = fn(wf); > > + if (unlikely(err)) { > > + intel_runtime_pm_put(i915, wf->wakeref); > > + mutex_unlock(&wf->mutex); > > + return err; > > + } > > + > > + smp_mb__before_atomic(); /* release wf->count */ > > These ones are like usually "What.." for me. > > According to the docs this line would suggest you want some other memory > write to become visible before the atomic_inc below. Which one? All the writes between atomic_read() and atomic_inc(). atomic_inc() broadcasts that you do not need the lock (or barrier) to see the results, so therefore all the results must be visible beforehand. We need to turn atomic_inc() into an atomic_inc_release/atomic_inc_unlock. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx