Fence Sync patches

Keith Packard keithp at keithp.com
Fri Dec 3 12:18:10 PST 2010

On Fri, 03 Dec 2010 14:16:43 -0500, Owen Taylor <otaylor at redhat.com> wrote:

> It's perhaps especially problematic in the case of the open source
> drivers where the synchronization is already handled correctly without
> this extra work and the extra work would just be a complete waste of
> time. [*]

I have hesitated to argue against this plan as it may be perceived as
corporate bias. But, I would sure like to see something better than an
argument from authority as to why this is necessary.

The trouble here is that all of the drivers we can look at don't need
this to work efficiently, so there's no way to evaluate the
design to see if it makes any sense at all.

Requiring changes to all compositing managers just to support one driver
seems like a failure in design, and it will be fragile as none of it
will matter until the compositing manager is run against that driver.

> But it doesn't seem like a particularly efficient or low-latency way of
> handling things even in the case of a driver with no built in
> synchronization.

I don't think it's all that different from the mechanisms used in the
open source drivers, it's just that the open source drivers do the
synchronization between multiple clients using the same objects
automatically inside the kernel. It looks like there are about the same
number of context switches, and a similar amount of user/kernel
traffic. For the other drivers, using similar language, we do:

    Client => xserver      [render this]
    X server => GPU        [render this, fence A]
    X server => compositor [something was rendered]
    compositor => xserver  [subtract damage]
    compositor => GPU      [wait A, render this]

With the explicit fencing solution, nothing appears on the screen until
the X server queues the fence trigger to the GPU and that gets executed,
so it may be that another client-xserver context switch is required,
once per frame.

The question I have is that if these fence objects can be explicitly
managed, why can't they be implicitly managed? Set a fence before
delivering damage, wait for the fence before accessing those objects
from another application, just as in the diagram above. The only
rendering from the client that we're talking about is the back->front
swap/copy operation, not exactly a high-frequency activity.
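
The implicit scheme asked about above can be sketched with a toy model.
This is only an illustration under stated assumptions: ToyDriver, its
methods, and the client and object names are hypothetical and do not
correspond to any real kernel driver or X server interface. The model
records a fence behind each piece of rendering and, when a different
client later accesses the same object, queues a wait on that fence
automatically, so neither client has to manage fence objects itself.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyDriver:
    """Hypothetical stand-in for a kernel driver tracking one fence per object."""
    queue: list = field(default_factory=list)        # GPU commands, in submission order
    last_writer: dict = field(default_factory=dict)  # object -> (client, fence id)
    _ids: count = field(default_factory=count)       # source of fresh fence ids

    def render(self, client, obj):
        """Queue rendering to obj and record a fence directly behind it."""
        fence = next(self._ids)
        self.queue.append(("render", client, obj))
        self.queue.append(("fence", fence))
        self.last_writer[obj] = (client, fence)
        return fence

    def read(self, client, obj):
        """Queue a read of obj, waiting first if another client wrote it last."""
        writer = self.last_writer.get(obj)
        if writer is not None and writer[0] != client:
            self.queue.append(("wait", writer[1]))
        self.queue.append(("read", client, obj))


driver = ToyDriver()
driver.render("xserver", "window-pixmap")   # X server => GPU [render this, fence A]
driver.read("compositor", "window-pixmap")  # compositor => GPU [wait A, render this]
for command in driver.queue:
    print(command)
```

In this sketch the wait is inserted only when the reader differs from
the last writer, so a client reading back its own rendering pays nothing
extra.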

That doesn't depend on having a single GPU hardware ring, just on having
a kernel driver that tracks these fences for each object to insert
appropriate inter-application synchronization operations. Heck, we can
even tell you which drawables have damage objects registered so that you
could avoid fencing anything except the Composite buffers for each

> [*] If it was documented that a Damage event didn't imply the rendering
> had hit the GPU, then the X server could be changed not to flush
> rendering before sending damage events.

Doing the flush is a recent change; we had similar rendering issues with
open source drivers until recently.

keith.packard at intel.com