GLX redirection extension

Fri Jul 21 08:44:45 PDT 2006

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Felix Bellaby wrote:
> On Thu, 2006-07-20 at 13:57 -0400, Kristian Høgsberg wrote:
>>On 7/20/06, Felix Bellaby <felix at bellaby.plus.com> wrote:
>>
>>>I think that I may have found a potentially useful way out of some
>>>problems within the current compositing framework. Basically, the server
>>>has information that the compositor would find useful.
>>
>>I'm not sure what problems you're trying to solve here, but we don't
>>need extra protocol for redirecting GLX.  When a window is redirected
>>with COMPOSITE, all rendering to that window is supposed to go to an
>>offscreen pixmap.
> 
> That is the principal problem. Rendering GL directly onto the screen is
> much faster than rendering into a pixmap and copying the pixmap onto the
> screen, especially if the second operation is performed using a texture
> map. The current situation allows you to use a composited desktop with
> 2D applications _OR_ use GL for serious applications like 3D modeling.
> You can not do both at the same time.
> 
> With GLX redirection, the compositor could allocate areas of the screen
> for use by GL applications and they could run at full speed unhampered
> by the draw -> pixmap -> texture -> screen set up. This is how direct
> GLX has worked under X in the past, and I can not see how this technique
> can be used in the future unless the compositor treats different apps in
> different ways based on their GLX requests. 

Right now rendering to pixmaps is not accelerated for GL.  We need to
fix that.  Kristian and I spent some time talking about how this might
be accomplished at DDC.  The "only" thing that we really need is a way
for the DRI driver, whether it's loaded in the client or in the server, to:

1. Find out where the off-screen pixmap is located in card memory.  This
may require the server to migrate the pixmap back to the card.

2. Ask the core server to not migrate the pixmap off the card while it
is doing 3D rendering.

Once we can do those two things, the DRI driver will treat the pixmap
the same way that it treats any other rendering target.  Since the data
is still in a pixmap, the compositor can treat it just like any other window.
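
As a rough illustration of the two operations listed above, here is a
minimal sketch of a "pin" interface.  Every name in it (struct pixmap,
pixmap_pin, pixmap_unpin, pixmap_try_evict, the fallback_offset parameter)
is hypothetical and invented for this example; none of it is taken from
the X server or any DRI driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical server-side pixmap record (not a real X server type). */
struct pixmap {
    bool     in_card_memory;  /* false once migrated to system memory */
    uint32_t card_offset;     /* meaningful only while in_card_memory */
    unsigned pin_count;       /* > 0 means "do not migrate off the card" */
};

/* Stand-in for the server's memory manager moving a pixmap onto the card. */
static void migrate_to_card(struct pixmap *p, uint32_t offset)
{
    p->card_offset = offset;
    p->in_card_memory = true;
}

/* Steps 1 and 2 together: make the pixmap resident, pin it, and report
 * where it lives so the driver can use it as a rendering target. */
static uint32_t pixmap_pin(struct pixmap *p, uint32_t fallback_offset)
{
    if (!p->in_card_memory)
        migrate_to_card(p, fallback_offset);
    p->pin_count++;
    return p->card_offset;
}

/* Called by the driver when 3D rendering to the pixmap has finished. */
static void pixmap_unpin(struct pixmap *p)
{
    assert(p->pin_count > 0);
    p->pin_count--;
}

/* The server may move the pixmap off the card only with no pins held. */
static bool pixmap_try_evict(struct pixmap *p)
{
    if (p->pin_count > 0)
        return false;
    p->in_card_memory = false;
    return true;
}
```

In this sketch the driver would call pixmap_pin() before rendering and
pixmap_unpin() afterwards.  A counter rather than a flag is used so that
several clients could hold the same pixmap at once, which is an assumption
of the example, not something decided in this thread.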

The big trick in the compositor is that you have to make sure the IDs
used for the hidden FBOs don't conflict with the IDs of FBOs used by the
application.  Since the application can specify any 32-bit integer for
an FBO ID that it wants, this is non-trivial.
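
One conceivable way around the ID conflict is to translate names: the
application keeps whatever 32-bit IDs it picks, and the implementation maps
each one to an internal name drawn from the same counter as the hidden
FBOs.  The sketch below assumes that approach purely for illustration; the
table layout and the function names are hypothetical, and the linear search
and fixed capacity are only there to keep it short.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FBOS 64  /* arbitrary small capacity for the sketch */

/* Hypothetical translation table between application-chosen FBO names
 * and the names actually used internally. */
struct fbo_names {
    uint32_t app_name[MAX_FBOS];   /* name the application asked for */
    uint32_t real_name[MAX_FBOS];  /* internal name it was mapped to */
    unsigned count;
    uint32_t next_real;            /* next unused internal name, starts at 1 */
};

/* Allocate an internal name for a hidden FBO; applications never see it. */
static uint32_t fbo_alloc_hidden(struct fbo_names *t)
{
    assert(t->next_real != 0);  /* 0 is reserved, and wrapping is not handled */
    return t->next_real++;
}

/* Map an application-chosen name to an internal one, creating the mapping
 * on first use.  Name 0 is passed through unchanged; 0 is also returned
 * when the table is full. */
static uint32_t fbo_translate(struct fbo_names *t, uint32_t app)
{
    if (app == 0)
        return 0;
    for (unsigned i = 0; i < t->count; i++)
        if (t->app_name[i] == app)
            return t->real_name[i];
    if (t->count == MAX_FBOS)
        return 0;
    t->app_name[t->count] = app;
    t->real_name[t->count] = fbo_alloc_hidden(t);
    return t->real_name[t->count++];
}
```

Even in this toy form, every entry point that accepts or returns an FBO
name would have to go through the table, which is part of why this is
non-trivial.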

The other issue is that we need to be able to do X rendering to a
window.  There is no way to do X rendering to an FBO; you can only do GL
rendering.  I think that's the deal breaker right there.

> Furthermore, rendering into an off screen pixmap currently fails to take
> full advantage of the capabilities of GL hardware. If the compositor /
> server responded to CreateWindow requests by offering a framebuffer
> object using a texture as the renderbuffer then the drawing could go
> directly into the texture with no intermediate pixmap. SwapBuffer
> requests on the GLXWindow drawable could move the rendering into a
> different texture. This would probably provide the fastest possible
> implementation of texture based compositing available on existing
> hardware, and it could be done using redirected GLX in combination with
> existing GLX/GL drivers. I am not convinced that patching the drivers
> will achieve the same speeds using texture_from_pixmap.

I thought about that route too.  After all, I'm in the framebuffer object
working group in the ARB. :)  The difficulty with going this route is
that it adds complexity to the compositor and the server-side GLX
implementation.  Since we need to have server-side accelerated rendering
to pixmaps anyway, I don't think going the framebuffer object route
really buys us anything.

> Sharing Pixmaps between the applications doing their drawing and the
> compositor drawing the screen can not be trivial when the applications
> and compositor occupy separate processes. Telling the compositor what is
> going on after the event using Damage reports works to a point, but the
> GLX specs are deliberately vague about what happens when you share
> drawables between uncoordinated processes. Perfectly compliant
> implementations are free to respond in undefined ways, and may do so in
> order to milk a bit more speed. 

The spec uses a lot of words like "implementation defined".  We're
defining our implementation. :)

> For example, the current nvidia drivers crash when GLXPixmaps are used
> with Pixmaps that COMPOSITE has already discarded following the
> unmap/resize of a window. nvidia have undertaken to fix this, but I do
> not think that it is strictly a bug. The nvidia drivers also seem to
> freeze GL applications when a compositor tries to access the COMPOSITE
> Pixmaps directly using GL. Again, this may be a bug that they can fix,
> but it may have a performance cost.
> 
> We seem set on the hope that every single GL driver developer will be
> able to solve these kinds of problems on behalf of the compositor,
> without losing performance or expecting any thanks. I think that it
> might be more sensible and efficient to design the compositor to do
> things using GLX spec compliant techniques, rather than leaving it
> largely to others to try to fix things up behind the scenes in their
> drivers.

I think you're reading too much into what the GLX spec says.  I don't
think we're doing or trying to do anything that the spec says is
illegal.  We are straying into some areas where the spec says the
behavior is undefined or platform specific.  We're defining the way that
we want our platform to work.  I can assure you that Apple and Microsoft
both do the same thing on their platforms.

>>  The two big issues with COMPOSITE today are that
>>accelerated OpenGL (direct or indirect) and Xv don't respect this
>>and render directly to the front buffer.
> 
> The latest nvidia drivers do perform accelerated GL drawing into off
> screen pixmaps when compositing is enabled. However, this is slower than
> doing it directly onto the screen back buffer. Xv renders directly onto
> a screen overlay because this is the only way to render quality video at
> acceptable framerates.

Up until the existence of the compositor, rendering to pixmaps was a
very, *very* uncommon usage.  I suspect that the issue is that Nvidia
has spent zero time optimizing this uncommon path.  The other potential
issue that I see is that the compositor necessarily adds a certain
amount of latency.  That's just the nature of that particular beast.

>>  Extra protocol or API won't
>>solve this, we "just" need to fix the implementation to consistently
>>redirect all kinds of rendering for a redirected window.
> 
> Forcing all direct OpenGL and Xv rendering into Pixmaps might make them
> obey the dictates of the current COMPOSITE regime, but it might also
> defeat their entire purpose - speed.
> 
> GLX redirection would enable the COMPOSITE regime to loosen its shackles
> so that maximum speeds could still be obtained. There might be other
> ways of doing this that are more immediately palatable, but I think that
> we need to address the problem of allowing existing software and
> hardware to work at the speeds that they currently enjoy.

Again, at a high level, there isn't a lot of difference between
accelerated rendering to a pixmap, a pbuffer, or an FBO.

>>There is one case where we will need extra protocol (this will be an
>>extension to the XF86DRI protocol): when doing direct OpenGL rendering
>>to a redirected window.  In this case all the rendering happens in the
>>client without the X server knowing, but the compositing manager will
>>need damage events for the pixmap in order to be able to recomposite
>>it as changes happen.  For this, we need an XF86DRI request that
>>allows libGL (the direct rendering client) to post damage on a window.
>> But note, none of this will be visible in the API.
> 
> I would imagine that DRI-compatible libGL will use this DRI request to
> inform the server of damage while processing a SwapBuffer call. nvidia
> appear to have followed a similar approach in their drivers.
> 
> GLX redirection would only be visible within the compositor, leaving the
> API used by ordinary applications unchanged. The compositor will have to
> be aware that some applications want to draw directly into the screen
> framebuffer in order to allow them to do so, and I think that an API
> change is needed to achieve this aim. 
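
To make the damage-posting request discussed above a little more concrete,
here is a minimal sketch of the bookkeeping it implies: the direct-rendering
client reports a rectangle (for example at buffer swap), the server
accumulates it, and the compositing manager later collects it.  The types
and function names are hypothetical; this is not the XF86DRI or DAMAGE
protocol, only a model of the idea, and it tracks a single bounding box
where a real implementation would more likely keep a region.

```c
#include <assert.h>
#include <stdbool.h>

/* Half-open rectangle: covers [x1, x2) by [y1, y2). */
struct rect { int x1, y1, x2, y2; };

/* Hypothetical per-window damage state kept by the server. */
struct window_damage {
    bool        pending;  /* true if damage has been posted and not taken */
    struct rect box;      /* bounding box of all pending damage */
};

/* What a "post damage" request handler might do: grow the pending box
 * so that it also covers the newly reported rectangle. */
static void post_damage(struct window_damage *d, struct rect r)
{
    assert(r.x1 <= r.x2 && r.y1 <= r.y2);
    if (!d->pending) {
        d->box = r;
        d->pending = true;
        return;
    }
    if (r.x1 < d->box.x1) d->box.x1 = r.x1;
    if (r.y1 < d->box.y1) d->box.y1 = r.y1;
    if (r.x2 > d->box.x2) d->box.x2 = r.x2;
    if (r.y2 > d->box.y2) d->box.y2 = r.y2;
}

/* Compositor side: fetch and clear the pending damage.
 * Returns false and leaves *out untouched if nothing is pending. */
static bool take_damage(struct window_damage *d, struct rect *out)
{
    if (!d->pending)
        return false;
    *out = d->box;
    d->pending = false;
    return true;
}
```

Applications never call anything like post_damage() themselves in this
model; libGL would issue it on their behalf, which matches the point that
none of this is visible in the API.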
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)

iD8DBQFEwPZtX1gOwKyEAw8RAqPZAJwPQaVd3mZOunqhIQz8aGTfGDviFwCdEAxU
vCuH8RoCVwfpcf2gbIvv0ME=
=6nWj
-----END PGP SIGNATURE-----