[PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.

Tue Dec 16 10:30:04 PST 2014

On 12/16/2014 09:23 AM, Keith Packard wrote:
> Mario Kleiner <mario.kleiner.de at gmail.com> writes:
>
>> The 0 case is good for benchmarking.
> Sure, but the current code does benchmarking just fine. In fact, because
> it doesn't copy queued frames that aren't the most recent before the
> vblank, benchmarks tend to run *faster* as a result, and people
> generally like that aspect of it...
>

Hmm. For benchmarking I think I'd consider that a mild form of cheating.
You get higher fps because you skip work such as the whole gpu blit
overhead and the host overhead for queuing / validating / processing
the copy command in the command stream, so the benchmark numbers no
longer translate very well into how the system would behave in a
non-benchmark situation.

... but read on below ...

>> In my specific case i always want vsync'ed swap for actual visual
>> stimulation in neuroscience/medical settings, with no frame skipped
>> ever. The bonus use for me, except for benchmarking how fast the system
>> can go, is if one has a multi-display setup, e.g., dual-display for
>> stereoscopic stimulation - one display per eye, or some CAVE like setup
>> for VR with more than 2 displays. You want display updates and scanout
>> on all of them synchronized, so the scene stays coherent. One simple way
>> for visually testing multi-display sync is to intentionally swap all of
>> them without vsync, e.g., timed to swap in the middle of the scanout. If
>> the tear-lines on all displays are roughly at the same vertical position
>> and stay there then that's a good visual test if stuff works. There are
>> other ways to do it, but this is the one method that seems to work
>> cross-platform, without lots of mental context switching depending on
>> what os/gpu/server/driver combo with what settings one uses, and much
>> more easy to grasp for scientists with no graphics background. You can
>> see at a glance if stuff is roughly correct or not.
> It seems like you want something that the GL API doesn't express
> precisely; my reading of the  GL spec definitely lets Present work the
> way it does today, and as you avoid tearing *and* improve performance in
> the vblank_mode=0 case, I'm very reluctant to change it.

From GLX_EXT_swap_control and MESA_swap_control:

"If <interval> is set to a value of 0, buffer swaps are not
     synchronized to a video frame."

It depends on how you interpret "not synchronized to a video frame".
Can you explain your interpretation?

I don't think the spec says anywhere that dropping old "not most recent
at vblank time" frames is allowed, as Present currently does. And the
current Present implementation does synchronize the "buffer swap" of its
most recently received Pixmap to the [onset of] a video frame, so while
it prevents tearing, it goes a bit slower than it could without vsync.
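
To make the difference between the two readings concrete, here is a toy
model (not Present code). It assumes integer timestamps in microseconds
and a made-up refresh period; the function names are hypothetical. One
function keeps only the most recent frame queued before each vblank, the
other shows every frame immediately.

```python
def shown_frames_latest_at_vblank(submit_times, refresh):
    """Model of the behaviour described above: at each vblank only the most
    recently queued frame is shown, older queued frames are dropped."""
    shown = {}
    for frame, t in enumerate(submit_times):
        # Index of the first vblank at or after the submission time
        # (integer ceiling division).
        vblank = -(-t // refresh)
        # A later frame for the same vblank replaces the earlier one.
        shown[vblank] = frame
    return [shown[v] for v in sorted(shown)]


def shown_frames_async(submit_times):
    """Model of an unsynchronized swap: every frame is shown right away,
    possibly with tearing, so nothing is dropped."""
    return list(range(len(submit_times)))


# Six frames submitted against a hypothetical 16000 us refresh period.
times = [1000, 5000, 9000, 17000, 30000, 40000]
print(shown_frames_latest_at_vblank(times, 16000))  # [2, 4, 5]
print(shown_frames_async(times))                    # [0, 1, 2, 3, 4, 5]
```

In this model frames 0, 1 and 3 never reach the screen in the first
case, which is the work a benchmark gets to skip.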

Every implementation other than DRI3/Present that I have experience
with has interpreted the spec the way that patch tries to restore, at
least everything I know of on OSX/Windows/Linux proprietary/DRI2. So my
interpretation is certainly a valid one, and it is the one that provides
consistency and therefore the least surprise to implementers of GL
clients and to end users.

> Present could trivially offer a new bit to force tearing; I'm not sure
> how you'd get at that from GL though.
>

It does already with PresentOptionAsync? It just needs to be used in 
accordance with the mainstream interpretation of the _swap_control spec, 
like this patch suggests.

I'm not trying to claim here that the current behaviour of Mesa+Present 
isn't useful for some types of applications like games. I'm just saying 
it shouldn't be the default behaviour for swapinterval 0 or > 0. As far 
as I understand the meaning, intention and origin of the
EXT_swap_control_tear extension, the current Present implementation 
would implement a useful approximation of EXT_swap_control_tear for a 
swapinterval of < 0. Not an exact implementation, but at least following 
the spirit of that extension.
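
Spelled out as a sketch, the mapping I'm arguing for looks like this.
The enum and function names are hypothetical and only illustrate the
policy; this is not the Mesa code.

```python
from enum import Enum


class PresentMode(Enum):
    # Hypothetical labels for the three behaviours discussed above.
    ASYNC = "swap immediately, may tear"
    VSYNC = "swap at vblank, never tear"
    VSYNC_TEAR_IF_LATE = "swap at vblank, tear only when late"


def present_mode_for(swap_interval):
    """Pick a presentation behaviour from the GL swap interval."""
    if swap_interval == 0:
        # swap_control: "not synchronized to a video frame".
        return PresentMode.ASYNC
    if swap_interval > 0:
        return PresentMode.VSYNC
    # Negative intervals are the EXT_swap_control_tear case.
    return PresentMode.VSYNC_TEAR_IF_LATE


print(present_mode_for(0).name)   # ASYNC
print(present_mode_for(1).name)   # VSYNC
print(present_mode_for(-1).name)  # VSYNC_TEAR_IF_LATE
```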

So I'm arguing for using that patch to restore the default behaviour
every other implementation has, while providing the current behaviour
via swap_control_tear. Or maybe even via some new swap_control_tear2 to
cover the difference between the current method and swap_control_tear.

While we are on the topic, I can also send you my Christmas wish list
with proposals for future Mesa/server releases:

1 - Another thing I'd love to have, which would require a new option
"PresentOptionDontSkip", is the ability to not skip present requests
which are late. That would make it possible to use Mesa's triple/quad
buffering to queue frames ahead of time for playing animations or
videos and still be certain that every queued frame was shown for at
least one video refresh cycle. I'd love to take advantage of the new
triple-buffering behaviour, or maybe even use Present directly somehow
for deeper n-buffering, but for some of my types of application I need
to be certain that frames are never skipped if something runs late. As
things are now, I'm forced to wait for swap completion of each
bufferswap before I can submit a new swapbuffers request, to make sure
Present will never drop rendered frames. So I have to impose the
constraints of double-buffering on my application for correctness,
although it could make use of n-buffering.

This would also make sense for an improved OML_sync_control 
implementation. That spec requires that no swap request is ever dropped, 
citing:

"If there are multiple outstanding swaps for the same window, at most 
one such swap can be satisfied per increment of MSC. The order of 
satisfying outstanding swaps of a window must be the order they were 
issued. Each window that has an outstanding swap satisfied by the same 
current MSC should have one swap done."

Getting this behaviour is difficult or impossible without some 
PresentOptionDontSkip.

2 - Some extension to INTEL_swap_events to signal whether a present
request was skipped, so I can find out for any specific "sbc" whether
its rendering reached the eyes of my end users or was silently discarded.
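
A rough sketch of what both wishes would mean for a client, as a toy
model only. The `dont_skip` parameter stands in for the proposed
PresentOptionDontSkip, and the `skipped` field is the hypothetical
addition to the swap event; neither exists today. The model assumes
target MSCs are queued in non-decreasing order.

```python
from dataclasses import dataclass


@dataclass
class SwapEvent:
    sbc: int       # swap buffer count of the request
    msc: int       # vblank count at which the request was resolved
    skipped: bool  # hypothetical flag from wish 2


def resolve_swaps(target_mscs, current_msc, dont_skip):
    """Resolve queued swaps, given in issue order, against the vblank counter."""
    events = []
    next_free = current_msc + 1
    for sbc, target in enumerate(target_mscs, start=1):
        if dont_skip:
            # OML_sync_control rule quoted above: at most one swap per MSC,
            # in issue order, none dropped.
            msc = max(target, next_free)
            next_free = msc + 1
        else:
            # Late requests all land on the next vblank; an earlier request
            # on the same vblank is replaced and never shown.
            msc = max(target, current_msc + 1)
            if events and events[-1].msc == msc:
                events[-1].skipped = True
        events.append(SwapEvent(sbc, msc, False))
    return events


# Four queued swaps, the first three already late at MSC 11.
for ev in resolve_swaps([10, 10, 11, 15], 11, dont_skip=False):
    print(ev)
for ev in resolve_swaps([10, 10, 11, 15], 11, dont_skip=True):
    print(ev)
```

Without the option, sbc 1 and 2 are reported as skipped and sbc 3
is shown at MSC 12. With it, the four swaps are shown at MSC 12, 13, 14
and 15.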

More later, will be away from the keyboard for a couple of hours,
-mario