Quoting Tvrtko Ursulin (2019-11-26 17:11:04)
> 
> On 26/11/2019 17:08, Tvrtko Ursulin wrote:
> > 
> > On 26/11/2019 17:05, Chris Wilson wrote:
> >> Quoting Tvrtko Ursulin (2019-11-26 17:00:53)
> >>>
> >>> On 26/11/2019 16:47, Chris Wilson wrote:
> >>>> Following 58b4c1a07ada ("drm/i915: Reduce nested prepare_remote_context()
> >>>> to a trylock"), we punt to the caller if the local intel_context
> >>>> happens to be busy as we try to rewrite the sseu (due to retiring in
> >>>> another thread). As the interlude should be short, spin until the lock
> >>>> is available.
> >>>>
> >>>> The regret for using mutex_trylock() and not an atomic insertion of the
> >>>> barrier is growing...
> >>>>
> >>>> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> >>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >>>> ---
> >>>>  drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 5 ++++-
> >>>>  1 file changed, 4 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> >>>> index 2ea4790f3721..571cc996577c 100644
> >>>> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> >>>> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> >>>> @@ -1197,7 +1197,10 @@ __sseu_test(const char *name,
> >>>>  	if (ret)
> >>>>  		goto out_pm;
> >>>>  
> >>>> -	ret = intel_context_reconfigure_sseu(ce, sseu);
> >>>> +	do {
> >>>> +		ret = intel_context_reconfigure_sseu(ce, sseu);
> >>>> +		cond_resched();
> >>>> +	} while (ret == -EAGAIN);
> >>>>  	if (ret)
> >>>>  		goto out_spin;
> >>>>  
> >>>
> >>> Hm I looked at the selftests, saw error is correctly propagated, and
> >>> concluded it will be fine. I missed the problem selftests will not
> >>> actually retry. But wait, can we even count that userspace will if all
> >>> of a sudden ctx.set_param starts returning -EAGAIN sporadically? Feels
> >>> like we may need to revert.
> >>
> >> We invoke the principle of drmIoctl() catches -EAGAIN.
> > 
> > I'm not comfortable with that. :( Not least how we are saying not to use
> > libdrm. man 2 ioctl does not mention -EAGAIN. :(
> 
> Or duct tape by looping in set_sseu as well?
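
(On the drmIoctl() point: the libdrm wrapper is essentially the retry
loop below. This is a from-memory sketch rather than a verbatim copy of
xf86drm.c, and the local name is made up, but any caller that does go
through the wrapper never sees a sporadic -EAGAIN. The gap, as noted
above, is the callers that issue the ioctl directly.)

#include <errno.h>
#include <sys/ioctl.h>

/*
 * Sketch of the libdrm-style wrapper: keep retrying while the ioctl is
 * interrupted or asks us to back off, so a transient -EAGAIN from the
 * kernel never escapes to the caller.
 */
static int drm_ioctl_retry(int fd, unsigned long request, void *arg)
{
	int ret;

	do {
		ret = ioctl(fd, request, arg);
	} while (ret == -1 && (errno == EINTR || errno == EAGAIN));

	return ret;
}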

Could do, the actual atomic insertion doesn't look too horrendous
(albeit yet another atomic op along that path), but will take a fair
amount of rationalising. Certainly another cup of tea worth.

diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index dca15ace88f6..181c6a3aaafa 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -753,11 +753,13 @@ __i915_active_fence_set(struct i915_active_fence *active,
 	struct dma_fence *prev;
 	unsigned long flags;
 
-	/* NB: must be serialised by an outer timeline mutex (active->lock) */
+	if (fence == rcu_access_pointer(active->fence))
+		return fence;
+
 	spin_lock_irqsave(fence->lock, flags);
 	GEM_BUG_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags));
 
-	prev = rcu_dereference_protected(active->fence, active_is_held(active));
+	prev = xchg((struct dma_fence ** __force)&active->fence, fence);
 	if (prev) {
 		GEM_BUG_ON(prev == fence);
 		spin_lock_nested(prev->lock, SINGLE_DEPTH_NESTING);
@@ -775,12 +777,13 @@ __i915_active_fence_set(struct i915_active_fence *active,
 		 * our list_del() [decoupling prev from the callback] or
 		 * the callback...
 		 */
-		prev = rcu_access_pointer(active->fence);
+		if (!rcu_access_pointer(active->fence)) {
+			RCU_INIT_POINTER(active->fence, fence);
+			prev = NULL;
+		}
 	}
 
-	rcu_assign_pointer(active->fence, fence);
 	list_add_tail(&active->cb.node, &fence->cb_list);
 
-	spin_unlock_irqrestore(fence->lock, flags);
 
 	return prev;

-Chris