[RFC] Fix attempt for Mesa + X-Server 1.20 + modesetting-ddx hangs on KDE5.

Mario Kleiner mario.kleiner.de at gmail.com
Fri May 4 13:45:40 UTC 2018

Two patches that solve the same problem in two different ways. The 1st
one is ready to go, the 2nd one would need its debug statements removed.

Only apply one of them for testing: the 2nd one is useless with
the 1st one applied, but it demonstrates the problem.

So X-Server 1.20 RC + modesetting-ddx with DRI3/Present hangs at least
KDE-5's plasmashell and makes KDE-5 unusable with that setup.

As KDE's plasmashell uses Qt 5's QtQuick OpenGL-based rendering APIs
to render scene graphs, this bug might affect other Qt applications
as well.

This fix works, but it points to some problems in modesetting-ddx's
current vblank handling, because other ddx'en seem to be mostly
unaffected by this Mesa bug.

Neither of these two fixes is a proper final solution, but either is
better than nothing. They leave the OML_sync_control extension's
glXWaitForSbcOML() and glXWaitForMscOML() calls and the
SGI_video_sync glXWaitVideoSyncSGI() function broken for some
use patterns.

The real problem, if I understand it correctly, is the way the lifetime
of dri3_drawables and loader_dri3_drawables is currently managed by Mesa's
bindContext() functions. Whenever glXMakeCurrent() etc. are called to
assign new/different GLXDrawables to the same context (i.e. one context
reused for drawing into many different drawables, as opposed to using
one dedicated context for each drawable), we destroy the underlying
DRIDrawables/dri3_drawables/loader_dri3_drawables and they lose all
state wrt. pending bufferswaps, msc, sbc, ust.

Nothing in the specs says that clients should expect to lose such
state on a GLXDrawable d1 whenever they reassign drawables other than
d1 to a GL context. A sequence like...

1. glXMakeCurrent(dpy, drawable1, context);
2. draw draw draw
3. glXSwapBuffers(dpy, drawable1);
4. glXMakeCurrent(dpy, drawable2, context); // drawable1 loses all state!
5. glXWaitForSbcOML(dpy, drawable1, ...);

... would probably cause a hang of the client in glXWaitForSbcOML, as
the function requires information stored in the "original" drawable1
up to step 3, but lost in step 4 due to dri3_drawable destruction.
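
To make the failure mode concrete, here is a toy model of the sequence
in plain C. It is not Mesa code: every name in it (toy_context,
toy_drawable, toy_make_current, ...) is made up for illustration, and it
only assumes what is described above, namely that rebinding the context
to a different drawable destroys the old drawable's state and that a
later lookup starts again from zero.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_DRAWABLES 8

/* Hypothetical per-drawable bookkeeping; not Mesa's actual struct. */
struct toy_drawable {
    bool    live;   /* a state object currently exists */
    int64_t sbc;    /* swaps recorded for this drawable */
};

struct toy_context {
    int current;    /* id of the bound drawable, -1 if none */
    struct toy_drawable table[TOY_MAX_DRAWABLES];
};

static void toy_init(struct toy_context *ctx)
{
    memset(ctx, 0, sizeof(*ctx));
    ctx->current = -1;
}

/* Look up a drawable's state, recreating it zeroed if it was destroyed. */
static struct toy_drawable *toy_lookup(struct toy_context *ctx, int id)
{
    struct toy_drawable *d = &ctx->table[id];
    if (!d->live) {
        d->live = true;
        d->sbc = 0;
    }
    return d;
}

/* Models the lifetime problem: rebinding destroys the old drawable's state. */
static void toy_make_current(struct toy_context *ctx, int id)
{
    if (ctx->current >= 0 && ctx->current != id)
        ctx->table[ctx->current].live = false;
    toy_lookup(ctx, id);
    ctx->current = id;
}

static void toy_swap_buffers(struct toy_context *ctx, int id)
{
    toy_lookup(ctx, id)->sbc++;
}

/* true if the target sbc is already recorded; false means a real wait
 * would keep blocking, since no further swap is pending in this model. */
static bool toy_sbc_reached(struct toy_context *ctx, int id, int64_t target)
{
    return toy_lookup(ctx, id)->sbc >= target;
}

/* Replays steps 1-5; returns whether the final wait could complete. */
static bool toy_replay(bool rebind)
{
    struct toy_context ctx;
    toy_init(&ctx);
    toy_make_current(&ctx, 1);          /* step 1 */
    toy_swap_buffers(&ctx, 1);          /* step 3: drawable 1 sbc becomes 1 */
    if (rebind)
        toy_make_current(&ctx, 2);      /* step 4: drawable 1 state destroyed */
    return toy_sbc_reached(&ctx, 1, 1); /* step 5 */
}
```

In this model toy_replay(false) reports that the wait can complete, while
toy_replay(true) reports that it cannot, because the recreated state for
drawable 1 has forgotten the swap from step 3.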

Patch 1 has a potentially large performance impact when switching
drawables on a given context, due to the enforced wait on swap completion,
but might save OML clients which wait for sbc/msc on a separate thread.
Patch 2 has no performance impact, but doesn't even partially solve the
trouble with OML_sync_control.
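
The cost of an enforced wait on swap completion can be sketched with
another toy in plain C. This is not the patch itself: the struct, its
counters and both function names are hypothetical, and the "wait" is
just a counter increment standing in for blocking on one completion
event. It only illustrates why draining every pending swap before the
state is torn down gets more expensive the more swaps are in flight.

```c
#include <stdint.h>

/* Hypothetical counters; the names are made up for this sketch. */
struct toy_swap_state {
    int64_t queued;     /* swaps submitted so far */
    int64_t completed;  /* swaps whose completion has been observed */
};

/* Stand-in for blocking until one swap completion arrives. */
static void toy_wait_one_completion(struct toy_swap_state *s)
{
    s->completed++;
}

/* Drain all pending swaps before the drawable state is destroyed.
 * Returns how many blocking waits were needed. */
static int toy_drain_before_unbind(struct toy_swap_state *s)
{
    int waits = 0;
    while (s->completed < s->queued) {
        toy_wait_one_completion(s);
        waits++;
    }
    return waits;
}
```

With three swaps queued and one already completed, the drain needs two
waits before the context may move on; with nothing pending it returns
immediately.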

However, I'm totally out of time at the moment and probably not the right
person to think about a better solution, and by dumb luck, my own application
doesn't recycle the same context for different drawables, but uses a
dedicated context for each drawable, so it dodges this bullet.

Therefore one of these patches is either a good enough fix for the KDE
hang problems atm. or a diagnosis of the problem as a starting point for
brighter people to deal with the root cause ;-)

