[Intel-gfx] [RFC v2 0/5] Waitboost drm syncobj waits
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Thu Feb 16 11:19:09 UTC 2023
On 14/02/2023 19:14, Rob Clark wrote:
> On Fri, Feb 10, 2023 at 5:07 AM Tvrtko Ursulin
> <tvrtko.ursulin at linux.intel.com> wrote:
>>
>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>
>> In i915 we have this concept of "wait boosting" where we give a priority boost
>> for instance to fences which are actively waited upon from userspace. This has
>> its pros and cons and can certainly be discussed at length. However, the fact is
>> some workloads really like it.
>>
>> Problem is that with the arrival of drm syncobj and a new userspace waiting
>> entry point it added, the waitboost mechanism was bypassed. Hence I cooked up
>> this mini series really (really) quickly to see if some discussion can be had.
>>
>> It adds a concept of "wait count" to dma fence, which is incremented for every
>> explicit dma_fence_enable_sw_signaling and dma_fence_add_wait_callback (like
>> dma_fence_add_callback but from explicit/userspace wait paths).
>
> I was thinking about a similar thing, but in the context of dma_fence
> (or rather sync_file) fd poll()ing. How does the kernel differentiate
> between "housekeeping" poll()ers that don't want to trigger boost but
> simply want to know when to do cleanup, and waiters who are waiting with some
> urgency? I think we could use EPOLLPRI for this purpose.
Sounds plausible as a way to distinguish the two.

I wasn't aware one could set POLLPRI in pollfd.events, but it appears it is allowed:
/* Event types that can be polled for.  These bits may be set in `events'
   to indicate the interesting event types; they will appear in `revents'
   to indicate the status of the file descriptor.  */
#define POLLIN      0x001   /* There is data to read.  */
#define POLLPRI     0x002   /* There is urgent data to read.  */
#define POLLOUT     0x004   /* Writing now will not block.  */
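For illustration, a minimal userspace sketch of what an "urgent" waiter could
look like, assuming the kernel were taught to treat POLLPRI on a sync_file fd
as the boost hint (which is exactly the open question here; the helper name
below is made up):

#include <poll.h>

/*
 * Hypothetical example: wait on a sync_file fd, additionally setting
 * POLLPRI to say "I am actively blocked on this fence, please boost".
 * Whether the kernel would honour POLLPRI this way is the open question.
 */
static int wait_fence_fd_boosted(int sync_file_fd, int timeout_ms)
{
	struct pollfd pfd = {
		.fd = sync_file_fd,
		.events = POLLIN | POLLPRI,
	};
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret < 0)
		return -1;	/* poll() failed */
	if (ret == 0)
		return 0;	/* timed out, fence not yet signalled */

	return 1;		/* fence signalled */
}

A purely housekeeping poller would keep .events at plain POLLIN and never
trigger the boost.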
> Not sure how that translates to waits via the syncobj. But I think we
> want to let userspace give some hint about urgent vs housekeeping
> waits.
Probably DRM_SYNCOBJ_WAIT_FLAGS_<something>.
Both look like easy additions on top of my series. It would just be a matter of choosing dma_fence_add_callback vs dma_fence_add_wait_callback based on the flags, as dma_fence_add_wait_callback is what I called the "explicit userspace wait" variant.
It would require userspace changes to make use of it, but that is probably okay, or even preferable, since it makes the thing less of a heuristic. What I don't know, however, is how feasible it is to wire it up with, say, OpenCL, OpenGL or Vulkan, to allow application writers to distinguish between housekeeping and performance sensitive waits.
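On the kernel side I would imagine the dispatch ending up roughly like the
sketch below. The flag name is a placeholder and the plumbing is simplified;
dma_fence_add_wait_callback() is the helper this series adds, assumed here to
mirror the dma_fence_add_callback() signature:

#include <linux/dma-fence.h>

/*
 * Sketch only: DRM_SYNCOBJ_WAIT_FLAGS_BOOST is a placeholder name and the
 * plumbing is simplified. dma_fence_add_wait_callback() is the new helper
 * from this series, assumed to mirror dma_fence_add_callback().
 */
static int syncobj_add_wait_cb(struct dma_fence *fence,
			       struct dma_fence_cb *cb,
			       dma_fence_func_t func,
			       u32 wait_flags)
{
	if (wait_flags & DRM_SYNCOBJ_WAIT_FLAGS_BOOST)
		/* Userspace says this wait is performance sensitive. */
		return dma_fence_add_wait_callback(fence, cb, func);

	/* Housekeeping wait, no boost. */
	return dma_fence_add_callback(fence, cb, func);
}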
> Also, on a related topic: https://lwn.net/Articles/868468/
Right, I missed that one.
One thing to mention is that my motivation here wasn't strictly waits relating to frame presentation, but clvk workloads which constantly move between the CPU and GPU. Even outside the compute domain, I think this is a workload characteristic where waitboost in general helps.

The concept of a deadline could still be used, I guess, by setting it to some artificially early value when an actual presentation time does not exist (something like the sketch below). But scanning that discussion, it seems the proposal got bogged down in interactions with mode setting and such?
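For this kind of workload I imagine the deadline variant would boil down to
something like the below, assuming the interface from that proposal
(dma_fence_set_deadline() taking a ktime_t):

#include <linux/dma-fence.h>
#include <linux/ktime.h>

/*
 * Sketch, assuming the deadline interface from the linked proposal. For a
 * CPU<->GPU ping-pong workload with no real presentation deadline, ask for
 * "as soon as possible" by passing the current time.
 */
static void hint_wait_asap(struct dma_fence *fence)
{
	dma_fence_set_deadline(fence, ktime_get());
}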
Regards,
Tvrtko