[RFC] DRI2 synchronization and swap bits

Mon Nov 2 08:35:35 PST 2009

On Sun, 1 Nov 2009 21:46:45 +0100
Mario Kleiner <mario.kleiner at tuebingen.mpg.de> wrote:
> I read this RFC and i'm very excited about the prospect of having  
> well working support for the OML_sync_control extension in DRI2 on  
> Linux/X11. I have been hoping for this to happen for years, so a big  
> thank you in advance! This is why i hope to provide some input from  
> the perspective of future "power-users" of functions like  
> glXGetSyncValuesOML(), glXSwapBuffersMscOML(), glXWaitForSbcOML(). I'm  
> the co-developer of a popular free-software toolkit (Psychtoolbox)  
> that is used mostly in the neuroscience / cognitive science
> community by scientists to find out how the different senses (visual,
> auditory, haptic, ...) work and how they work together. Our
> requirements on graphics are often much more demanding than those of
> a videogame, typical vr-environment or a mediaplayer.

Thanks a lot for taking time to go through this stuff, it's exactly the
kind of feedback I was hoping for.

> Our users often have very strict requirements for scheduling frame- 
> accurate and tear-free visual stimulus display, synchronizing  
> bufferswaps across display-heads, and low-latency returns from swap- 
> completion. Often they need swap-completion timestamps which are  
> available with the shortest possible delay after a successful swap  
> and accurately tied to the vblank at which scanout of a swapped
> frame started. The need for timestamps with sub-millisecond accuracy
> is not uncommon. Therefore, well working OML_sync_control support
> would be basically a dream come true and a very compelling feature
> for Linux as a platform for cognitive science.

Doing the wakeups within a millisecond should definitely be possible,
I don't expect the context switch between display server and client
would be *that* high of a cost (but as I said I'll benchmark).

> 2. On the CompositePage in the DRM Wiki, there is this comment:  
> "...It seems that composited apps should never need to know about  
> real world screen vblank issues, ... ....When dealing with a  
> redirected window it seems it would be acceptable to come up with an  
> entirely fake number for all existing extensions that care about  
> vblanks.."
> 
> I don't like this idea about entirely fake numbers and would like to vote  
> for a solution that is as close as possible to the non-redirected  
> case. Most of our applications run in non-redirected, full-screen,  
> undecorated, page-flipped windows, ie., without a compositor being  
> involved. I can think of a couple future usage cases though where  
> reasonably well working redirected/composited windows would be very  
> useful for us, but only if we get meaningful timestamps and vblank  
> counts that are tied to the actual display onset.

The raw numbers will always be exposed to the compositor and probably
to applications via an opt-out mechanism (to be defined still, we don't
even have the extra compositor protocol defined).

> 3. The Wiki also mentions "The direct rendered cases outlined in the  
> implementation notes above are complete, but there's a bug in the  
> async glXSwapBuffers that sometimes causes clients to hang after  
> swapping rather than continue." Looking through the code of
> <http://cgit.freedesktop.org/~jbarnes/xf86-video-intel/tree/src/i830_dri.c?id=a0e2e624c47516273fa3d260b86d8c293e2519e4>
> i can see that in  
> I830DRI2SetupSwap() and I830DRI2SetupWaitMSC(), in the "if (divisor  
> == 0) { ...}" path, the functions return after DRM_VBLANK_EVENT  
> submission without assigning *event_frame = vbl.reply.sequence;
> This looks problematic to me, as the xserver is later submitting  
> event_frame in the call to DRI2AddFrameEvent() inside
> DRI2SwapBuffers() as a cookie to find the right events for clients to wait on?
> Could this be a reason for clients hanging after swap? I found a few
> other spots where i either misunderstood something or there are small
> bugs. What is the appropriate way to report these?

This list is fine, thanks for checking it out.  I'll fix that up.

> 4. According to spec, the different OML_sync_control functions do  
> return a UST timestamp which is supposed to reflect the exact time
> of when the MSC last incremented, i.e., at the start of scanout of a
> new video frame. SBC and MSC are supposed to increment
> atomically/simultaneously at swap completion, so the UST in the (UST,SBC,MSC)  
> triplet is supposed to mark the time of transition of either MSC or  
> MSC and SBC at swap completion. This makes a lot of sense to me, it  
> is exactly the type of timestamp that our toolkit critically depends
> on.
> 
> Ideally the UST timestamp should be corrected to reflect start of  
> scanout, but a UST that is consistently taken at vblank interrupt  
> time would do as well. In the current implementation this is *not*  
> the semantic we'd get for UST timestamps.
> 
> The I830DRI2GetMSC() call uses a call to drmWaitVBlank() and its  
> returned vbl.reply.tval_sec and vbl.reply.tval_usec values for  
> computing UST.
> I830DRI2SetupSwap() and I830DRI2SetupWaitMSC() ask drmWaitVBlank()
> to drm_queue_vblank_event() vblank events. Later on, UST is computed  
> from the timestamp contained in the dequeued events.
> 
> If you look at the drm_wait_vblank() and drm_queue_vblank_event()  
> functions in the current drm_irq.c inside the linux-next tree,
> you'll expect the following undesirable behaviour:
> 
> I830DRI2GetMSC -> drmWaitVBlank -> drm_wait_vblank: Falls through  
> DRM_WAIT_ON, because the wait condition is not satisfied and calls  
> do_gettimeofday(&now) for the UST timestamp. This timestamping is
> not synchronized to the vblank at all!

Yeah, I have a patch to fix that.  We need to make the timestamp always
correspond to when the vblank interrupt event arrives.

> I830DRI2SetupSwap() or I830DRI2SetupWaitMSC() -> drmWaitVBlank ->  
> drm_wait_vblank -> drm_queue_vblank_event for a certain
> vblwait->request.sequence number. If this target sequence number has not yet
> been reached, the event gets queued and later on timestamped via  
> do_gettimeofday() in drm_handle_vblank_events(), which is called
> from the vblank irq handler --> Exactly the behaviour we want! If
> however the vblwait->request.sequence number has been reached already
> in drm_queue_vblank_event() then the routine will retire the event  
> immediately and apply a do_gettimeofday() timestamp immediately,  
> which will result in a wrong UST timestamp.

Oops, will fix that too.
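
To make the intended behaviour concrete, here is a rough userspace model of
the fix, not the actual kernel patch. All names (vbl_stamp, vbl_event,
queue_event, handle_vblank) are made up for illustration. The point is only
that an event whose target sequence has already been reached gets the
timestamp recorded at the last vblank interrupt rather than the current time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration only; not the DRM structures. */
struct vbl_stamp { int64_t sec; int64_t usec; };

struct vbl_event {
    uint32_t target_seq;     /* sequence the client asked for */
    bool done;               /* true once the event has been retired */
    uint32_t seq;            /* sequence at which it was retired */
    struct vbl_stamp stamp;  /* timestamp of that vblank interrupt */
};

/* Wraparound-tolerant test for "cur has reached target". */
bool seq_reached(uint32_t cur, uint32_t target)
{
    return (uint32_t)(cur - target) <= (1u << 23);
}

/*
 * Queue an event. If the target has already passed, retire it at once,
 * but stamp it with the time of the last vblank interrupt, never "now".
 * Returns true if the event was retired immediately.
 */
bool queue_event(struct vbl_event *ev, uint32_t target,
                 uint32_t cur_seq, struct vbl_stamp last_irq_stamp)
{
    assert(ev != NULL);
    ev->target_seq = target;
    ev->done = false;
    ev->seq = 0;
    ev->stamp.sec = 0;
    ev->stamp.usec = 0;
    if (seq_reached(cur_seq, target)) {
        ev->done = true;
        ev->seq = cur_seq;
        ev->stamp = last_irq_stamp;
    }
    return ev->done;
}

/* Called from the (modelled) vblank interrupt with the new count and time. */
void handle_vblank(struct vbl_event *ev, uint32_t new_seq,
                   struct vbl_stamp irq_stamp)
{
    assert(ev != NULL);
    if (!ev->done && seq_reached(new_seq, ev->target_seq)) {
        ev->done = true;
        ev->seq = new_seq;
        ev->stamp = irq_stamp;
    }
}
```

Either way the client sees a timestamp taken at a vblank interrupt, so the
immediate and the deferred paths give the same semantics.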

> Unreliable UST timestamps would make the whole OML_sync_control  
> extension almost useless for us and probably other applications that  
> require good sync, e.g., between video and audio streams, so i'd ask you  
> politely for improvements here.

Definitely; these are just bugs, I certainly didn't design it to behave
this way! :)

> I guess one (simple from the viewpoint of a non-kernel hacker?) way  
> would be to always timestamp the vblank in the drm_handle_vblank()  
> routine, immediately after incrementing the vblank_count, probably  
> protecting both the timestamp acquisition and vblank increment by
> one spinlock, so both get updated atomically? Then one could maybe  
> extend drm_vblank_count() to read out and return vblank count and  
> corresponding timestamp simultaneously under protection of the lock?  
> Or any other way to provide the timestamp together with the vblank  
> count in an atomic fashion to the calling code in  
> drm_wait_vblank(), drm_queue_vblank_event() and  
> drm_handle_vblank_events()?

Yep, that would work and should be a fairly easy change.
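
Roughly what I have in mind, modelled in userspace with a pthread mutex
standing in for the spinlock. The struct and function names here are
hypothetical, and the real change would live in the DRM vblank code. The
interrupt side bumps the count and stores the timestamp under one lock, and
readers fetch both under the same lock so they always get a matching pair.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical userspace model; not the DRM data structures. */
struct vblank_state {
    pthread_mutex_t lock;   /* stands in for the kernel spinlock */
    uint32_t count;         /* vblank counter */
    struct timespec stamp;  /* time of the vblank that produced count */
};

void vblank_state_init(struct vblank_state *s)
{
    assert(s != NULL);
    pthread_mutex_init(&s->lock, NULL);
    s->count = 0;
    s->stamp.tv_sec = 0;
    s->stamp.tv_nsec = 0;
}

/* Models the interrupt handler: count and timestamp change together. */
void vblank_state_tick(struct vblank_state *s, const struct timespec *now)
{
    assert(s != NULL && now != NULL);
    pthread_mutex_lock(&s->lock);
    s->count++;
    s->stamp = *now;
    pthread_mutex_unlock(&s->lock);
}

/* Returns the count and writes the timestamp that belongs to it. */
uint32_t vblank_state_read(struct vblank_state *s, struct timespec *stamp_out)
{
    uint32_t count;

    assert(s != NULL && stamp_out != NULL);
    pthread_mutex_lock(&s->lock);
    count = s->count;
    *stamp_out = s->stamp;
    pthread_mutex_unlock(&s->lock);
    return count;
}
```

Callers that need a (count, timestamp) pair would then use the combined
read instead of fetching the count and calling gettimeofday separately.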

Thanks,
Jesse