Re: [Intel-gfx] [RFC v2 0/5] Waitboost drm syncobj waits

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 14/02/2023 19:14, Rob Clark wrote:
On Fri, Feb 10, 2023 at 5:07 AM Tvrtko Ursulin
<tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:

From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

In i915 we have this concept of "wait boosting" where we give a priority boost
for instance to fences which are actively waited upon from userspace. This has
it's pros and cons and can certainly be discussed at lenght. However fact is
some workloads really like it.

Problem is that with the arrival of drm syncobj and a new userspace waiting
entry point it added, the waitboost mechanism was bypassed. Hence I cooked up
this mini series really (really) quickly to see if some discussion can be had.

It adds a concept of "wait count" to dma fence, which is incremented for every
explicit dma_fence_enable_sw_signaling and dma_fence_add_wait_callback (like
dma_fence_add_callback but from explicit/userspace wait paths).

I was thinking about a similar thing, but in the context of dma_fence
(or rather sync_file) fd poll()ing.  How does the kernel differentiate
between "housekeeping" poll()ers that don't want to trigger boost but
simply know when to do cleanup, and waiters who are waiting with some
urgency.  I think we could use EPOLLPRI for this purpose.

Sounds plausible to allow distinguishing the two.

I wasn't aware one can set POLLPRI in pollfd.events but it appears it could be allowed:

/* Event types that can be polled for.  These bits may be set in `events'
   to indicate the interesting event types; they will appear in `revents'
   to indicate the status of the file descriptor.  */
#define POLLIN          0x001           /* There is data to read.  */
#define POLLPRI         0x002           /* There is urgent data to read.  */
#define POLLOUT         0x004           /* Writing now will not block.  */

Not sure how that translates to waits via the syncobj.  But I think we
want to let userspace give some hint about urgent vs housekeeping
waits.

Probably DRM_SYNCOBJ_WAIT_FLAGS_<something>.

Both look easy additions on top of my series. It would be just a matter of dma_fence_add_callback vs dma_fence_add_wait_callback based on flags, as that's how I called the "explicit userspace wait" one.

It would require userspace changes to make use of it but that is probably okay, or even preferable, since it makes the thing less of a heuristic. What I don't know however is how feasible is to wire it up with say OpenCL, OpenGL or Vulkan, to allow application writers distinguish between house keeping vs performance sensitive waits.

Also, on a related topic: https://lwn.net/Articles/868468/

Right, I missed that one.

One thing to mention is that my motivation here wasn't strictly waits relating to frame presentation but clvk workloads which constantly move between the CPU and GPU. Even outside the compute domain, I think this is a workload characteristic where waitboost in general helps. The concept of deadline could still be used I guess, just setting it for some artificially early value, when the actual time does not exist. But scanning that discussion seems the proposal got bogged down in interactions between mode setting and stuff?

Regards,

Tvrtko



[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux