On Wed, May 08, 2019 at 09:30:42PM +0100, Chris Wilson wrote:
> Quoting Daniel Vetter (2019-05-08 13:53:30)
> > On Wed, May 8, 2019 at 2:06 PM Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > Currently there is an underlying assumption that i915_request_unsubmit()
> > > is synchronous wrt the GPU -- that is, the request is no longer in flight
> > > as we remove it. In the near future that may change, and this may upset
> > > our signaling as we can process an interrupt for that request while it
> > > is no longer in flight.
> > >
> > > CPU0					CPU1
> > > intel_engine_breadcrumbs_irq
> > > (queue request completion)
> > > 					i915_request_cancel_signaling
> > > ...					...
> > > 					i915_request_enable_signaling
> > > dma_fence_signal
> > >
> > > Hence in the time it took us to drop the lock to signal the request, a
> > > preemption event may have occurred and re-queued the request. In the
> > > process, that request would have seen I915_FENCE_FLAG_SIGNAL clear and
> > > so reused the rq->signal_link that was in use on CPU0, leading to bad
> > > pointer chasing in intel_engine_breadcrumbs_irq.
> > >
> > > A related issue was that if someone started listening for a signal on a
> > > completed but no longer in-flight request, we missed the opportunity to
> > > immediately signal that request.
> > >
> > > Furthermore, as intel_contexts may be immediately released during
> > > request retirement, in order to be entirely sure that
> > > intel_engine_breadcrumbs_irq may no longer dereference the intel_context
> > > (ce->signals and ce->signal_link), we must wait for the irq spinlock.
> > >
> > > In order to prevent the race, we use a bit in the fence.flags to signal
> > > the transfer onto the signal list inside intel_engine_breadcrumbs_irq.
> > > For simplicity, we use the DMA_FENCE_FLAG_SIGNALED_BIT as it then
> > > quickly signals to any outside observer that the fence is indeed signaled.
> > >
> > > v2: Sketch out potential dma-fence API for manual signaling
> > > v3: And the test_and_set_bit()
> > >
> > > Fixes: 52c0fdb25c7c ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking")
> > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> > > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >
> > Ok chatted a bit with Chris on irc, and we already allow drivers to
> > pass whatever spinlock is suitable for their request/fence
> > handling/retiring, so allowing to split this all up makes sense I
> > think. Has my ack on the approach.
> >
> > I think it'd be good to spell out the optimization this allows
> > (synchronous cleanup instead of refcounting all. the. things.) plus
> > showcase the link between the fence->lock pointer and the split-up
> > __dma_signal functions in the kerneldoc. Definitely for the core
> > patch.
> >
> > Also not sure why __notify and __timestamp have a double underscore in
> > the middle. That color choice confuses me a bit :-)
>
> I like it for subphases. The overall action here is still the dma-fence
> 'signal', but now we are doing the 'set timestamp', 'notify callbacks',
> etc. Otherwise we gain a subject to the verb: 'signal_notify' says to go
> and signal the notify, rather than do the notifications; for me the '__'
> breaks the association. Maybe dma_fence_signal_do_notifies.

At the cost of 2 more letters in already long function names, I think
_do_step1, _do_step2 are clearer ...

Aside: Could we merge the timestamp and do_notify steps together, maybe
with the spinlock in there? I think materially it doesn't matter whether
we set the timestamp before or inside the spinlock-protected section, as
long as we don't set it afterwards. And it would simplify the interface
a bit.
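Very rough sketch of what I mean (completely untested, and the helper
bodies below are guesses from this discussion, not lifted from the actual
patch):

#include <linux/dma-fence.h>

/* claim the fence for signaling; false means it was already signaled */
static bool __dma_fence_signal(struct dma_fence *fence)
{
	return !test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags);
}

/*
 * Merged timestamp + notify step, called under fence->lock: record the
 * timestamp inside the spinlock-protected section, then run the
 * callbacks. Setting the timestamp before taking the lock would be just
 * as good; only setting it after the callbacks have run would be wrong.
 */
static void __dma_fence_signal_locked(struct dma_fence *fence,
				      ktime_t timestamp)
{
	struct dma_fence_cb *cur, *tmp;

	lockdep_assert_held(fence->lock);

	fence->timestamp = timestamp;
	set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &fence->flags);

	list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) {
		list_del_init(&cur->node);
		cur->func(fence, cur);
	}
}

That would leave the test_and_set_bit() as the only thing deciding who
gets to transfer the request onto the signal list, and the timestamp
question disappears from the interface.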
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch