Initial DRI3000 protocol specs available
Ian Romanick
idr at freedesktop.org
Mon Mar 11 14:51:42 PDT 2013
On 03/07/2013 02:52 PM, James Jones wrote:
> On 03/07/2013 01:19 PM, Owen Taylor wrote:
>> On Thu, 2013-02-28 at 16:55 -0800, Keith Packard wrote:
>>
>>>> * It would be great if we could figure out a plan to get to the
>>>> point where the exact same application code is going to work for
>>>> proprietary and open source drivers. When you get down to the
>>>> details of swap, this isn't close to the case currently.
>>>
>>> Agreed -- the problem here is that except for the nVidia closed drivers,
>>> everything else implicitly serializes device access through the kernel,
>>> providing a natural way to provide some defined order of
>>> operations. Failing that, I'd love to know what mechanisms *could* work
>>> with that design.
>
> Fence syncs. Note the original fence sync + multi-buffer proposal
> solved basically the same problems you're trying to solve here, as well
> as everything Owen's WM spec updates do, but more generally, and with
> that, a little more implementation complexity. It included proposals to
> make minor updates to GLX/EGL as well to tie them in with the newer
> model. There didn't seem to be much interest outside of NVIDIA, so
> besides fence sync, the ideas are tabled internally ATM.
>
>> I don't think serialization is actually the big issue - although it's
>> annoying to deal with fences that are no-ops for the open source
>> drivers, it's pretty well defined where you have to insert them, and
>> because they are no-ops for the open source drivers, there's little
>> overhead.
>>
>> Notification is more of an issue.
>>
>>>> - Because swap is handled client-side in some drivers, INTEL_swap_event
>>>> is seen as awkward to implement.
>>>
>>> I'm not sure what could be done here, other than to have some way for
>>> the X server to get information about the swap and stuff it into the
>>> event stream, of course. It could be as simple as having the client
>>> stuff the event data to the X server itself.
>>
>> It may be that a focus on redirection makes things easier - once the
>> compositor is involved, we can't get away from X server involvement. The
>> compositor is the main case where the X server can be completely
>> bypassed when swapping. And I'm less concerned about API divergence for
>> the compositor. (Not that I *invite* it...)
>>
>>>> - There is divergence on some basic behaviors, e.g., whether
>>>> glXSwapBuffers() followed by glFinish() waits for the swap to complete or not.
>>>
>>> glXSwapBuffers is pretty darn explicit in saying that it *does not* wait
>>> for the swap to complete, and glFinish only promises to synchronize the
>>> effects of rendering ("contents of the frame buffer"), not the actual
>>> swap operation itself. I'm not sure how we're supposed to respond when
>>> drivers ignore the spec and do their own thing?
>>
>> I wish the GLX specification was clear enough so we actually knew who
>> was ignoring the spec and doing their own thing... ;-) The GLX
>> specification describes the swap operation as the contents of the back
>> buffer "become the contents of the front buffer" ... that seems like an
>> operation on the "contents of the frame buffer".
>
> The GLX spec is plenty clear here. It states:
>
> "Subsequent OpenGL commands can be issued immediately, but will not be
> executed until the buffer swapping has completed..."
There are two ambiguities in this one sentence. :)
1. What does "executed" mean? Does it mean the GPU doesn't do the work
(i.e., that things are causally ordered), or does it mean the command
won't even be queued until the buffer swapping has completed?
2. What does "completed" mean? Does it mean the pixels are visible on
the user's monitor, or does it mean sufficient work has happened for the
back buffer to be ready for rendering?
Unfortunately, different users want different things, and different
implementations provide different things.
> And glFinish, besides the fact that it counts as a GL command, isn't
> defined as simply waiting until effects on the framebuffer land. All
> rendering, client, and server (GL server, not X server) state side
> effects from previous operations must settle before it returns.
> SwapBuffers affects all three of those. Same for fence syncs with
> condition GL_SYNC_GPU_COMMANDS_COMPLETE.
Yeah, glFinish and fences are special.
> So if the drawable swapped is current to the thread calling swap
> buffers, and they issue any other GL commands afterwards, including
> glFinish, glFenceSync, etc., those commands can't complete until after
> the swap operation does. For glFinish, that means it can't return. For
> fence, the fence won't trigger until the swap finishes. If
> implementations aren't behaving that way, it's a bug in the
> implementation. Not to say our implementation doesn't have bugs, but
> AFAIK, we don't have that one.
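James's ordering argument can be modeled without any GL at all. The sketch
below is a mock, strictly in-order command queue (invented types and names,
not GL API) showing why a fence enqueued after a swap cannot signal before
the swap retires:

```c
#include <assert.h>
#include <stdbool.h>

/* Mock command stream: commands retire strictly in issue order, so a
 * fence issued after a swap cannot signal before the swap completes.
 * These types are illustrative stand-ins, not real GL objects. */
enum cmd_kind { CMD_SWAP, CMD_FENCE };

struct cmd { enum cmd_kind kind; bool done; };

/* Retire commands in order, up to and including index 'upto'. */
static void retire_through(struct cmd *q, int upto)
{
    for (int i = 0; i <= upto; i++)
        q[i].done = true;
}

/* A fence is signaled only once every command at or before it retired. */
static bool fence_signaled(const struct cmd *q, int fence_idx)
{
    for (int i = 0; i <= fence_idx; i++)
        if (!q[i].done)
            return false;
    return true;
}
```

Under this model an implementation where the fence fires before the swap
finishes is simply not expressible, which is the bug James says would be in
the implementation rather than the spec.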
>
> Thanks,
> -James
>
>> But getting into the details here is a bit of a distraction - my goal is
>> to try to get us to convergence so we have only one API with well
>> defined behaviors.
>>
>>>> - When rendering with a compositor, the X server is innocent of
>>>> relevant information about timing and when the application should
>>>> draw additional new frames. I've been working on handing this
>>>> via client <=> compositor protocols
>>>
>>> With 'Swap', I think the X server should be involved as it is necessary
>>> to be able to 'idle' buffers which aren't in use after the
>>> compositor is done with them. I tried to outline a sketch of how that
>>> would work before.
>>>
>>>> (https://mail.gnome.org/archives/wm-spec-list/2013-January/msg00000.html)
>>>>
>>>>
>>>> But this adds a lot of complexity to the minimal client,
>>>> especially
>>>> when a client wants to work both redirected and unredirected.
>>>
>>> Right, which is why I think fixing the X server to help here would be
>>> better.
>>
>> If the goal is really to obsolete the proposed WM spec changes, rather
>> than just make existing GLX apps work better, then there's quite a bit
>> of stuff to get right. For example, from my perspective, the
>> OML_sync_control defined UST timestamps are completely insufficient -
>> it's not even defined what the units are for these timestamps!
>>
>>>> I think it would be great if we could sit down and figure out what
>>>> the Linux-ecosystem API is for this in a way we could give to
>>>> application authors.
>>>
>>> Ideally, a GL application using simple GLX or EGL APIs would work
>>> 'perfectly', without the need to use additional X-specific APIs. My hope
>>> with splitting DRI3000 into separate DRI3 and Swap extensions is to
>>> provide those same semantics to simple double-buffered 2D applications
>>> using core X and Render drawing as well, without requiring that they be
>>> rewritten to use GL, and while providing all of the same functionality
>>> over the network as local direct rendering applications get today.
>>
>> The GLX APIs have some significant holes and poorly defined aspects. And
>> they don't properly take compositing into account, which is the norm
>> today. So providing those capabilities to 2D apps seems of limited
>> utility.
>>
>> [...]
>>
>>>> The SwapComplete event is specified as - "This event is delivered
>>>> when a SwapRegion operation completes" - but the specification
>>>> of SwapRegion itself is fuzzy enough that I'm unclear exactly what
>>>> that means.
>>>>
>>>> - The description SwapRegion needs to define "swap" since the
>>>> operation has only a vague resemblance to the English-language
>>>> meaning of "swap".
>>>
>>> Right, SwapRegion can either be a copy operation or an actual swap. The
>>> returned information about idle buffers tells the client what they
>>> contain, so I think the only confusion here is over the name of the
>>> request?
>>
>> The confusion to me is that we all have some idea of what a "swap" is,
>> and what "complete" means, but when we try to nail things down, the
>> details are not so clear. I'd rather we were precise about the meaning
>> than try to leave wriggle room for future stuff.
>>
>> * What do you get if you CopyArea/glReadPixels/draw with TFP from
>> various targets?
>>
>> * What is scanned out from the front buffer to the output device?
>>
>> Swap should be defined in terms of these basic concepts.
>>
>> [...]
>>
>>>> - What happens when multiple SwapRegion requests are made with a
>>>> swap-interval of zero. Are previous ones discarded?
>>>
>>> Any time a SwapRegion request is made with one still pending, the server
>>> may choose to skip the first contents and swap directly to the second
>>> contents. I'm not sure how this would be visible to the application
>>> though?
>>
>> The application can tell by looking at events. My question was also
>> inspired by thinking about the question of what would happen in the
>> redirected case if the client and the compositor had side-band protocols
>> to control the rate of presentation.
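The skip-ahead behavior Keith describes could be modeled as a one-deep
pending queue: with swap-interval 0, a new SwapRegion replaces a
still-pending one and the server presents only the newest contents. The
names below are invented for illustration and are not DRI3/Swap protocol:

```c
#include <assert.h>

/* One-deep pending-swap model: a new request with swap-interval 0 may
 * replace a still-pending one.  queue_swap() returns the swap count of
 * the frame that was discarded, or -1 if nothing was pending, which is
 * exactly the information an event stream would let the client observe. */
struct pending_swap { long swap_count; int valid; };

static long queue_swap(struct pending_swap *p, long swap_count)
{
    long skipped = p->valid ? p->swap_count : -1;
    p->swap_count = swap_count;
    p->valid = 1;
    return skipped;
}
```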
>>
>> [...]
>>
>>>> - What's the interaction between swap-interval and target-msc, etc?
>>>
>>> I'm afraid I just copied these from the DRI2 spec without really
>>> understanding the precise semantics. They originally came from the
>>> related GL specs.
>>
>> DRI2 combined together concepts from several overlapping specs. As long
>> as it was an implementation detail, the question of what happens when
>> things overlap wasn't a big issue. If we make Swap app-exposed, then a
>> hand-wave isn't sufficient.
>>
>>>> - When a window is redirected, what's the interpretation of
>>>> swap-interval, target-msc, etc? Is it that the server performs the
>>>> operation at the selected blanking interval (as if the window
>>>> wasn't redirected), and then damage/other events are generated
>>>> and the server picks it up and renders to the real front buffer
>>>> at the next opportunity - usually a frame later.
>>>
>>> This sends us down a very deep hole, and one which I intend to resolve
>>> at some point, but for now, I'd love to focus on getting the semantics
>>> for non-redirected windows looking sane, and then try to figure out how
>>> to replicate those semantics in a redirected world.
>>
>> We have a set of semantics for GLX that we need to keep working, since
>> there are going to be piles of old GLX applications. But once we move
>> beyond that, then redirection is the normal case.
>>
>> [...]
>>
>>>> * In the definition of SWAPIDLE you say:
>>>>
>>>> If valid is TRUE, swap-hi/swap-lo form a 64-bit
>>>> swap count value from the SwapRegion request which matches the
>>>> data that the pixmap currently contains
>>>>
>>>> If I'm not misunderstanding things, this is a confusing statement
>>>> because, leaving aside damage to the front buffer, pixmaps always
>>>> contain the same contents (whatever the client rendered into it.)
>>>
>>> No, the Swap operation may actually *replace* the pixmap contents with
>>> the other buffer contents. That allows for efficient pointer swapping
>>> instead of actual data copying. This number lets the client know what
>>> the pixmap holds as a result of this operation, which may simply be the
>>> previous pixmap contents or may be the contents from one or more frames
>>> previous.
>>
>> So the association of pixmap ID to buffer can change as the result of a
>> swap operation? What's the motivation for this? - it seems to me that
>> once we start labeling buffers with pixmap ID's, it would be simpler to
>> keep the association - it wouldn't hinder the server from implementing a
>> swap as either a copy or an exchange.
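However that question is settled, the swap-hi/swap-lo pair in SWAPIDLE is
just a 64-bit counter split into two 32-bit fields; a client would recombine
it as below (illustrative helper, not part of the proposed wire protocol):

```c
#include <assert.h>
#include <stdint.h>

/* SWAPIDLE delivers the 64-bit swap count split into two 32-bit fields;
 * the client reassembles it to learn which frame's contents the now-idle
 * pixmap holds (relevant when the server swapped buffer pointers rather
 * than copying).  Helper name is hypothetical. */
static uint64_t swap_count_from_parts(uint32_t swap_hi, uint32_t swap_lo)
{
    return ((uint64_t)swap_hi << 32) | swap_lo;
}
```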
>>
>>>> * What control, if any, will applications have over the number of
>>>> buffers used - what the behavior will be when an application starts
>>>> rendering another frame in terms of allocating a new buffer versus
>>>> swapping?
>>>
>>> The application is entirely in charge of allocating buffers; the server
>>> never allocates anything. As such, the application may well choose to
>>> pause until buffers go idle before continuing to render so as to limit
>>> buffer use to a sane amount.
>>
>> The question was really about whether we envision exposing such a control to
>> apps rendering via GLX.
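A client-side throttling policy along the lines Keith sketches might look
like the following; the pool size and all names are invented for
illustration, not anything the Swap extension specifies:

```c
#include <assert.h>

#define POOL_SIZE 3

/* Client-owned buffer pool: the client allocates all buffers and
 * throttles itself by refusing to render until one is reported idle.
 * Illustrative model only, not the DRI3/Swap wire protocol. */
struct buffer_pool { int busy[POOL_SIZE]; };

/* Returns the index of an idle buffer to render into, or -1 meaning the
 * client should block until a SWAPIDLE-style notification arrives. */
static int acquire_buffer(struct buffer_pool *p)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!p->busy[i]) {
            p->busy[i] = 1;
            return i;
        }
    }
    return -1; /* every buffer is in flight: pause rendering */
}

/* Called when the server reports a buffer idle. */
static void mark_idle(struct buffer_pool *p, int i)
{
    p->busy[i] = 0;
}
```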
>>
>>> Switching to events for Idle notification should make this a lot more logical
>>> and perhaps easier to understand, although the client implementation
>>> will be a pain.
>>>
>>>> * Do we need to deal with stereo as part of this?
>>>
>>> Probably? But I'm not sure how?
>>
>> One thing I'll point out here is that the texture_from_pixmap already
>> has some support for stereo written into it. As it is written the pixmap
>> ID represents *all* the buffers for the window: left, right, and aux.
>>
>> This poses some issues for viewing the pixmap as a "normal pixmap" in X
>> terms, but otherwise simplifies stereo to a question of buffer format.
>> An issue for the DRI3 extension rather than the Swap extension.
>>
>> - Owen
> _______________________________________________
> xorg-devel at lists.x.org: X.Org development
> Archives: http://lists.x.org/archives/xorg-devel
> Info: http://lists.x.org/mailman/listinfo/xorg-devel
>