Fence Sync patches
James Jones
jajones at nvidia.com
Fri Dec 3 14:14:34 PST 2010
On Friday 03 December 2010 12:18:10 pm Keith Packard wrote:
> On Fri, 03 Dec 2010 14:16:43 -0500, Owen Taylor <otaylor at redhat.com> wrote:
> > It's perhaps especially problematic in the case of the open source
> > drivers where the synchronization is already handled correctly without
> > this extra work and the extra work would just be a complete waste of
> > time. [*]
>
> I have hesitated to argue against this plan as it may be perceived as
> corporate bias. But, I would sure like to see something better than an
> argument from authority as to why this is necessary.
I appreciate the effort to remain unbiased, but always welcome any technical
feedback. In this case, the issues you bring up were already considered and
discussed on IRC.
For my part, I regret that this work may be seen by some as an NVIDIA-backed
attempt to dump functionality only needed by a closed-source driver on an
under-staffed open-source project. That is not my intention, and while the
discussion has centered around the one initial application of the changes that
currently only benefits our driver, I do think fence objects will be generally
useful on all platforms. Further, I'll be around to support this code, answer
questions about it, etc, as long as NVIDIA keeps paying me, and probably even
if they don't just because I find it interesting. Working on this code has
also taught me a lot about the X code base and development process, and I hope
I can help out more in the future with other issues too as a result. I'll be
applying for an fd.org account shortly.
> The trouble here is that all of the drivers we can look at don't need
> this to work efficiently, so there's no way to evaluate the
> design to see if it makes any sense at all.
>
> Requiring changes to all compositing managers just to support one driver
> seems like a failure in design, and it will be fragile as none of it
> will matter until the compositing manager is run against that driver.
I disagree that this is a failure of design. I always try to strike a balance
between the most efficient solution for the end users and the constraints of
development and maintenance. And we're used to being the odd man out. We
regularly need to provide patches for, or at least point out, bugs in many
projects that only happen on our drivers for whatever reason (we support a
different set of GL extensions, we accelerate a different set of X render
operations, older apps rely on non-compliant SGI weirdness, newer apps rely on
non-compliant/undefined DRI/mesa/whatever behavior, etc.). It's not in our
interest to needlessly burden our users, but different isn't necessarily
always bad.
> > But it doesn't seem like a particularly efficient or low-latency way of
> > handling things even in the case of a driver with no built in
> > synchronization.
>
> I don't think it's all that different from the mechanisms used in the
> open source drivers, it's just that the open source drivers do the
> synchronization between multiple clients using the same objects
> automatically inside the kernel. It looks like there are about the same
> number of context switches, and a similar amount of user/kernel
> traffic. For the other drivers, using similar language, we do:
>
> Client => xserver [render this]
> X server => GPU [render this, fence A]
> X server => compositor [something was rendered]
> compositor => xserver [subtract damage]
> compositor => GPU [wait A, render this]
>
> With the explicit fencing solution, nothing appears on the screen until
> the X server queues the fence trigger to the GPU and that gets executed,
> so it may be that another client-xserver context switch is required,
> once per frame.
Agreed, as stated in my response to Owen, there should be little difference
overall.
> The question I have is that if these fence objects can be explicitly
> managed, why can't they be implicitly managed? Set a fence before
> delivering damage, wait for the fence before accessing those objects
> from another application, just as in the diagram above. The only
> rendering from the client that we're talking about is the back->front
> swap/copy operation, not exactly a high-frequency activity.
>
> That doesn't depend on having a single GPU hardware ring, just on having
> a kernel driver that tracks these fences for each object to insert
> appropriate inter-application synchronization operations. Heck, we can
> even tell you which drawables have damage objects registered so that you
> could avoid fencing anything except the Composite buffers for each
> window.
Couple of things here:
The open source drivers have chosen to perform this implicit synchronization.
However, it isn't required by any specification. The second paragraph of the
GLX spec explicitly notes that this synchronization is not required. The
burden is on clients to implement the synchronization with the assumption that
they will know how much synchronization is needed, and which is the most
efficient method of synchronization for their needs. I think that was a very
good design decision, and continues to be so. Just because new applications
and X extensions introduce new synchronization needs doesn't mean new forms of
implicit synchronization should be shoe-horned in at the driver level.
Rather, a clean form of explicit synchronization should be exposed to
applications that they can use in any way they see fit.
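To illustrate the explicit model, here is a minimal sketch written as a toy
Python simulation. It is not X server or driver code. The Fence and Channel
classes, run_gpu(), and composite_frame() are all hypothetical names invented
for this example, and the round-robin scheduling is an assumption, not a
description of any real GPU. It only shows that a wait on fence A keeps the
compositor's rendering behind the X server's rendering, even when the
compositor queues its commands first.

```python
from collections import deque


class Fence:
    """Hypothetical stand-in for a fence object; starts untriggered."""

    def __init__(self, name):
        self.name = name
        self.triggered = False


class Channel:
    """Hypothetical per-client GPU command queue, executed in order."""

    def __init__(self, name):
        self.name = name
        self.commands = deque()

    def render(self, what):
        self.commands.append(("render", what))

    def trigger(self, fence):
        self.commands.append(("trigger", fence))

    def wait(self, fence):
        self.commands.append(("wait", fence))


def run_gpu(channels):
    """Round-robin over the channels (an assumed scheduling policy).

    A channel stalls while its next command waits on an untriggered fence.
    Returns the rendering operations in the order they executed.
    """
    log = []
    while any(ch.commands for ch in channels):
        progressed = False
        for ch in channels:
            if not ch.commands:
                continue
            op, arg = ch.commands[0]
            if op == "wait" and not arg.triggered:
                continue  # stalled until another channel triggers the fence
            ch.commands.popleft()
            progressed = True
            if op == "trigger":
                arg.triggered = True
            elif op == "render":
                log.append((ch.name, arg))
        if not progressed:
            raise RuntimeError("deadlock: a fence is awaited but never triggered")
    return log


def composite_frame():
    fence_a = Fence("A")
    compositor = Channel("compositor")
    xserver = Channel("xserver")

    # The compositor queues first on purpose: the fence, not submission
    # order, is what keeps its rendering behind the X server's.
    compositor.wait(fence_a)
    compositor.render("composite window")

    xserver.render("client drawing")
    xserver.trigger(fence_a)

    return run_gpu([compositor, xserver])


if __name__ == "__main__":
    for channel, operation in composite_frame():
        print(channel, "->", operation)
```

In this toy model, a wait on a fence that nothing ever triggers shows up as a
stalled channel, which is the failure mode a client takes on when it manages
its own synchronization.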
In theory, implicit synchronization would be possible in our driver model.
However, it would almost certainly be more work than adding and maintaining
this X extension and modifying a half-dozen composite managers to use it (I
consider the composite manager modifications needed fairly trivial, and as I
mentioned, I'll be updating at least one of them myself as an example). Our
kernel driver has very, very little awareness of surface management.
Thanks,
-James
> > [*] If it was documented that a Damage event didn't imply the rendering
> > had hit the GPU, then the X server could be changed not to flush
> > rendering before sending damage events.
>
> Doing the flush is a recent change; we had similar rendering issues with
> open source drivers until recently.