Quoting Tvrtko Ursulin (2020-02-11 13:41:22) > > On 10/02/2020 20:57, Chris Wilson wrote: > > +static void kill_context(struct i915_gem_context *ctx) > > +{ > > + if (!list_empty(&ctx->stale.engines)) > > + kill_stale_engines(ctx); > > Lets see.. set_engines can freely race with context_close. The former is > adding entries to the list under the lock, the latter is here inspecting > list state unlocked. But then proceeds to lock it and all is good if > false negative are not an issue. But looks like it could happen and then > it fails to clean up. All that is needed is for this thread to not see > the addition to the list. And since this is not a hot path how about you > just always call kill_state_engines? Hmm. I didn't consider the race between close context and set-engines. We would also need to reject the late addition of engines to a closed context under the spinlock. Ta. > > #endif /* __I915_GEM_CONTEXT_TYPES_H__ */ > > diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c > > index 51ba97daf2a0..bc6d4f8b78f0 100644 > > --- a/drivers/gpu/drm/i915/i915_sw_fence.c > > +++ b/drivers/gpu/drm/i915/i915_sw_fence.c > > @@ -211,10 +211,23 @@ void i915_sw_fence_complete(struct i915_sw_fence *fence) > > __i915_sw_fence_complete(fence, NULL); > > } > > > > -void i915_sw_fence_await(struct i915_sw_fence *fence) > > +bool i915_sw_fence_await(struct i915_sw_fence *fence) > > { > > - debug_fence_assert(fence); > > - WARN_ON(atomic_inc_return(&fence->pending) <= 1); > > + int old, new; > > + > > + /* > > + * It is only safe to add a new await to the fence while it has > > + * not yet been signaled. > > + */ > > + new = atomic_read(&fence->pending); > > + do { > > + if (new < 1) > > + return false; > > + > > + old = new++; > > + } while ((new = atomic_cmpxchg(&fence->pending, old, new)) != old); > > Simplify with atomic_try_cmpxchg? I was under the mistaken impression we didn't have atomic_try_cmpxchg. > I need a refresher on sw_fence->pending. (See your new comments and > raise you lack of old! ;) > > -1 = completed > 0 = ?? -1 = completed (all listeners awoken) 0 = signaled 1+ = pending waits (and yes we always start with 1 pending wait on behalf of the caller) > 1 = new, not waited upon > 2 = waited upon Maybe we don't really need -1? That was originally to avoid passing FENCE_COMPLETE, FENCE_FREE but since we have the state, we don't need to encode it? That would lead to a few small simplifications. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx