On Fri, Nov 04, 2016 at 08:03:57PM +0000, Chris Wilson wrote:
> Flushing the cachelines for an object is slow, can be as much as 100ms
> for a large framebuffer. We currently do this under the struct_mutex BKL
> on execution or on pageflip. But now with the ability to add fences to
> obj->resv for both flips and execbuf (and we naturally wait on the fence
> before CPU access), we can move the clflush operation to a workqueue and
> signal a fence for completion, thereby doing the work asynchronously and
> not blocking the driver or its clients.
>
> Suggested-by: Akash Goel <akash.goel@xxxxxxxxx>
> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: Akash Goel <akash.goel@xxxxxxxxx>

Needs a bit more work to restrict the async operations. In the end, I
think only the explicit paths towards execbuf / flip should opt in, as
the majority will want sync (pread/pwrite/set-domain).

This idea came up in a discussion on whether we needed create2 for early
clflush or whether we could exploit set-domain for the same
functionality. Now, we can do the clflush asynchronously from create,
but we must do it synchronously in set-domain (albeit now it could be
done outside of the struct_mutex).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
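
For illustration only, the split between the async paths (execbuf / flip)
and the sync paths (pread/pwrite/set-domain) can be sketched in userspace C.
This is not i915 code: a pthread stands in for the workqueue, a
mutex/condvar pair stands in for the fence, and every name below
(flush_fence, fake_object, object_flush_async, object_flush_sync) is
hypothetical. The actual clflush loop is replaced by a placeholder.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a fence: signaled once the flush has finished. */
struct flush_fence {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool signaled;
};

/* Hypothetical stand-in for an object with dirty CPU cachelines. */
struct fake_object {
    bool cache_dirty;
    struct flush_fence fence;
    pthread_t worker;
};

static void fence_init(struct flush_fence *f)
{
    int err;

    err = pthread_mutex_init(&f->lock, NULL);
    assert(err == 0);
    err = pthread_cond_init(&f->cond, NULL);
    assert(err == 0);
    (void)err;
    f->signaled = false;
}

static void fence_signal(struct flush_fence *f)
{
    pthread_mutex_lock(&f->lock);
    f->signaled = true;
    pthread_cond_broadcast(&f->cond);
    pthread_mutex_unlock(&f->lock);
}

static void fence_wait(struct flush_fence *f)
{
    pthread_mutex_lock(&f->lock);
    while (!f->signaled)
        pthread_cond_wait(&f->cond, &f->lock);
    pthread_mutex_unlock(&f->lock);
}

/* Worker: performs the (placeholder) flush, then signals completion. */
static void *flush_worker(void *arg)
{
    struct fake_object *obj = arg;

    /* Placeholder for the slow cacheline flush over the object's pages. */
    obj->cache_dirty = false;
    fence_signal(&obj->fence);
    return NULL;
}

/*
 * Async path (execbuf / flip): queue the flush and return immediately.
 * Later consumers call fence_wait() before touching the object.
 */
static int object_flush_async(struct fake_object *obj)
{
    fence_init(&obj->fence);
    return pthread_create(&obj->worker, NULL, flush_worker, obj);
}

/*
 * Sync path (pread/pwrite/set-domain): the flush must be complete on
 * return, so wait on the fence before handing control back.
 */
static int object_flush_sync(struct fake_object *obj)
{
    int err = object_flush_async(obj);

    if (err)
        return err;
    fence_wait(&obj->fence);
    return pthread_join(obj->worker, NULL);
}
```

In this sketch the sync path is just the async path followed by a wait, so
only callers that opt in see the flush as deferred. An async caller is
expected to fence_wait() and then pthread_join() the worker itself.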