[PATCH] Revert "Set DamageSetReportAfterOp to true for the damage extension" (#30260)

Wed Oct 20 11:35:46 PDT 2010

Thanks for this discussion.  More precise semantics are always better.
Comments below:

On Wed, Oct 20, 2010 at 11:05:29AM -0700, Keith Packard wrote:
> On Sun, 17 Oct 2010 09:58:50 -0700, Aaron Plattner <aplattner at nvidia.com> wrote:
> 
> > This change does not solve the problem it claims to solve because it
> > makes two assumptions about the driver that are not always true:
> > 
> > 1. that it sends rendering requests to the GPU immediately, which can
> >    kill performance on some GPUs, and
> > 2. that the GPU processes all requests from multiple clients in order,
> >    which is false for some GPUs.
> 
> When the compositing manager receives the damage event, it must be able
> to access content from all applications which resulted in the reported
> damage. How else can this ever work?

By sending X protocol to explicitly fence the GL rendering with the X
rendering, which is the whole purpose of James's XDamageSubtractAndTrigger
request.  Otherwise, the driver has to implicitly fence all rendering (and
probably all CUDA kernels that interoperate with X) for damage events,
which will not perform well.

> > All a Damage event tells you is that your next X request will be
> > procesed after the damage has landed, not that the damage has already
> > landed or even that the rendering commands that cause the damage have
> > been submitted to the GPU.
> 
> It doesn't require anything about the current state of the GPU, only
> that future rendering requests include the results of any rendering
> which resulted in the reported damage.

In the context of the Damage extension, "future rendering requests" means
future *X* rendering requests -- this is how you told me it worked when I
asked you about it.  Implicitly fencing all other rendering through any API
when a Damage event is generated is a spec change that we could make, but
it would require discussion.

Since X rendering all happens in order, it doesn't matter whether the
client gets the event before the driver sees the rendering request because
it will have done by the time it receives any later requests.

If your driver model can use the ReportAfter semantics to get implicit
fencing across APIs, then that seems like something you could enable in the
driver via a DamageCreate or DamageRegister hook.  It doesn't seem like it
belongs in the server.

> In a multi-queue rendering engine, you clearly need to perform some
> cross-queue synchronization so that the rendering performed by the
> compositing manager sees the rendering performed by other applications.
> 
> The switch to reporting events after performing the rendering operation
> ensured that the driver was notified about the event delivery before the
> client received the event (through the 'flush' callback). It's easy to
> believe that this notification is not what multi-queue hardware wants,
> and that perhaps some other mechanism would be preferred.

The flush callback sounds nifty, but it won't solve the problem for
multi-queue hardware.

> The switch to post-op damage reporting was teste on hardware which has
> only a single queue and for which ensuring that the shuffle of requests
> hitting the graphics hardware is synchronized with the shuffle of
> requests executed by the X server at appropriate times.
> 
> > In addition to the above, this commit also broke the Compiz
> > "Wallpaper" plugin.
> 
> That's just a bug in the damage code; it would be lovely to see the
> sequence of damage events delivered in both cases so we could fix
> it. There should be no difference in the generated damage events, only
> the sequence within the X server of when those events are generated, and
> in the absence of a full output buffer requiring an intermediate flush
> operation, there shouldn't even be a change in when those events are
> delivered to the application.

Sure, I agree that it should have worked, but I don't agree that the change
is necessary in the core server code.

> I don't see any way to construct correct Damage semantics without
> delaying the damage event delivery until after the rendering operation
> has been seen by the driver; the old code relied on the damage event
> getting stuck in the client output buffer.

The old (and new, with this change) Damage semantics just plain don't work
on multi-queue hardware with direct-rendering clients; it's a limitation in
the specification that's addressed by James's sync extension.

> -- 
> keith.packard at intel.com