Re: [PATCH] drm/i915: Perform object clflushing asynchronously

Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> · Fri, 4 Nov 2016 20:16:29 +0000

On Fri, Nov 04, 2016 at 08:03:57PM +0000, Chris Wilson wrote:
> Flushing the cachelines for an object is slow, can be as much as 100ms
> for a large framebuffer. We currently do this under the struct_mutex BKL
> on execution or on pageflip. But now with the ability to add fences to
> obj->resv for both flips and execbuf (and we naturally wait on the fence
> before CPU access), we can move the clflush operation to a workqueue and
> signal a fence for completion, thereby doing the work asynchronously and
> not blocking the driver or its clients.
> 
> Suggested-by: Akash Goel <akash.goel@xxxxxxxxx>
> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: Akash Goel <akash.goel@xxxxxxxxx>

Needs a bit more work to restrict the async operations. In the end, I
think only the explicit paths towards execbuf / flip should opt in,
as the majority will want sync (pread/pwrite/set-domain). This idea came
up in a discussion on whether we needed create2 for early clflush or
whether we could exploit set-domain for the same functionality. Now, we
can do the clflush asynchronously from create, but we must do it
synchronously in set-domain (albeit now it could be done outside of the
struct_mutex).
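
The opt-in split described above could be modelled roughly as follows. This
is a user-space toy in C, not the patch itself and not i915 code: every name
in it (toy_object, toy_fence, CLFLUSH_ASYNC, toy_clflush_object,
toy_run_worker) is hypothetical, the "workqueue" is a single pending flag
drained by hand, and the flush only clears a dirty bit. It is meant to show
only the control flow: execbuf / flip style callers pass the async flag and
get a fence to wait on, while pread/pwrite/set-domain style callers flush
before returning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag: only the execbuf / flip paths would pass this. */
#define CLFLUSH_ASYNC 0x1u

/* Stand-in for a fence attached to the object's reservation. */
struct toy_fence {
	bool signaled;
};

struct toy_object {
	bool cache_dirty;             /* CPU cachelines still need flushing */
	bool flush_queued;            /* a deferred flush is pending */
	struct toy_fence flush_fence; /* signaled once the flush has run */
};

/* The slow part: flush the cachelines, then signal completion. */
static void toy_do_clflush(struct toy_object *obj)
{
	obj->cache_dirty = false;
	obj->flush_fence.signaled = true;
}

/*
 * Flush an object. Returns true if the work was deferred, in which case
 * the caller must wait on flush_fence before relying on the flush.
 */
static bool toy_clflush_object(struct toy_object *obj, unsigned int flags)
{
	assert(obj != NULL);

	if (!obj->cache_dirty)
		return false; /* nothing to do */

	obj->flush_fence.signaled = false;

	if (flags & CLFLUSH_ASYNC) {
		/* Opt-in path: queue the flush and return immediately. */
		obj->flush_queued = true;
		return true;
	}

	/* Default path: flush now, superseding any queued flush. */
	obj->flush_queued = false;
	toy_do_clflush(obj);
	return false;
}

/* Stand-in for the workqueue callback draining the deferred flush. */
static void toy_run_worker(struct toy_object *obj)
{
	assert(obj != NULL);

	if (obj->flush_queued) {
		obj->flush_queued = false;
		toy_do_clflush(obj);
	}
}
```

In this model a create-time or execbuf-time caller would use
toy_clflush_object(obj, CLFLUSH_ASYNC) and leave the fence to gate later
access, whereas a set-domain style caller would pass 0 and find the object
clean on return. Locking is deliberately left out of the sketch.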
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre