[PATCH] DRI2: fixup handling of last_swap_target

Thu Mar 4 14:54:24 PST 2010

On Thu, 4 Mar 2010 23:42:06 +0100
Mario Kleiner <mario.kleiner at tuebingen.mpg.de> wrote:
> Ok Jesse, i checked them. Our patches are happy with each other as  
> far as i can see :-) -- thanks for cleaning this up.

Excellent, thanks for checking.

> I went over the whole hw/xfree86/dri2/dri2.c file again and found a  
> couple of new bugs and problems. Don't have the time for preparing  
> patches myself now, so i'll just list them, from severe to harmless:
> 
> * In DRI2CreateDrawable(), in the new last_swap_target init code, you  
> call...
> 
>   ret = (*ds->GetMSC)(pDraw, &ust, &pPriv->last_swap_target);
> 
> ... without checking for if (!ds->GetMSC) ...

Oh that could be crashy with a DDX w/o GetMSC support, will fix.

> * In DRI2CreateDrawable(), there is pPriv->swap_interval = 1;
> 
> Default swap interval is set to 1 at init, ie., sync to retrace is  
> on. I'd suggest considering a 0 swap interval as default. The  
> glXSwapIntervalSGI() call only allows to set swap_interval to > 0 due  
> to this bit from the SGI_swap_control spec: "glXSwapIntervalSGI  
> returns GLX_BAD_VALUE if parameter <interval> is less than or equal  
> to zero."
> 
> That means once you have a non-zero swap_interval, applications can't  
> opt-out of vsyn'ed bufferswaps, they can only opt-in. Iow there ain't  
> no way for an app to start without vsync.

Yeah, that's a valid issue.  Though really SGI_swap_interval should
take 0 as valid...  I think there's a Mesa extension for it?

> Related, in DRI2SwapInterval(), a check for...
> 
>      if (pPriv == NULL)
>          return BadDrawable;
> 
> ... seems to be missing. All other routines check for this, so this  
> one should probably do so as well.

Yep, good idea.

> * There seems to be some inconsistency or missing pieces in the way a  
> drawable is destroyed, but i may miss something.
> 
> In DRI2DestroyDrawable(), once the pPriv->refcount drops to zero, the  
> drawable is destroyed if there aren't any swaps pending. If swaps are  
> pending, according to the comments there, destruction is postponed  
> until DRI2SwapComplete() gets called. Makes sense.
> 
> But in DRI2SwapComplete() there isn't any logic that would check if  
> (pPriv->swapsPending == 0 && pPriv->refcount == 0) and then free the  
> drawable. Instead it checks for pPriv->refcount == 0 and if so,  
> throws an error and free's the drawable. It shouldn't throw an error  
> and it should only free it at the very end of the routine after  
> DRI2WakeClient() was called, if (pPriv->swapsPending == 0). Current  
> implementation has a swap limit of 1, so this is always true, but if  
> the allowable number of outstanding swaps would be increased at some  
> time, this would cause problems.

Oh good point; we wouldn't want to free until the last swap was
completed or we'd try to deref a bad drawable.

> Another issue is DRI2WaitMsc() and DRI2WaitSbc(). They should  
> probably complete immediately or throw an error if called while the  
> drawable is already awaiting destruction with a pPriv->refcount == 0?

Yeah, probably, will fix.

> If a wait for a certain target_msc or target_sbc is already  
> scheduled, one should probably also defer the destruction of the  
> drawable until they complete?

Or otherwise prevent a crash somehow.  I just realized there's a
potential resource starvation issue here; malicious apps could queue a
bunch of swaps or waits that won't happen for a long them, then destroy
their windows etc and eat up server memory.  Should handle that somehow
by destroying client state asap.

> Proposals:
> 
> DRI2WaitSBC and DRI2WaitMSC, maybe others as well (e.g.,  
> DRI2WaitSwap?) should probably check for pPriv->refcount == 0 and  
> return either an error or return immediately on such an "undead"  
> drawable. Error return is probably better, given that the  
> OML_sync_control spec requires an error return if no context/drawable  
> is bound at time of call, and the drawable has been kind'a destroyed  
> already if refcount == 0.
> 
> DRI2DestroyDrawable() should probably postpone destruction if (pPriv- 
>  >swapsPending > 0 || pPriv->blockedClient != NULL), ie., if any  
> waitsbc/waitmsc is already scheduled.
> 
> DRI2SwapComplete at its end, and DRI2WaitMSCComplete should probably  
> check if it is time to destroy the drawable before returning and do  
> so  if neccessary to avoid leaks.

Agree.

> Finally at a quick glance, there are more places in the code were  
> there are potential problems with CARD64 values vs. 32 bit integers.  
> The OML_sync_control spec wants CARD64 as return values for msc and  
> and as target_msc for glXWaitForMscOML and glXSwapbuffersMscOML(),  
> but the DRM in the kernel only operates with 32 bit vbl.request/ 
> reply.sequence numbers and some of the DRI2 functions pass values as  
> 32 bit signed integers. Assuming a video refresh rate of 200 Hz for  
> the fastest crt monitors i'm aware of, this could cause some overflow  
> problems after 2^31 / 200 / 3600 / 24 = 124 days of uptime - not that  
> unrealistic for a desktop machine. This will probably need fixes to  
> the DRM or some wraparound handling or range checking in the xserver  
> to make it really robust for unlucky applications that happen to get  
> launched at the 124th day of uptime.

Yeah, the DDX could probably handle that by virtualizing the target_msc
passed in by the X server.

-- 
Jesse Barnes, Intel Open Source Technology Center