[PATCH] Cannot wait for user lock in sync_while_locked

Keith Packard keithp at keithp.com
Fri Mar 9 16:31:14 PST 2012

On Sun, 04 Mar 2012 02:04:27 -0800, Keith Packard <keithp at keithp.com> wrote:

> diff --git a/src/locking.c b/src/locking.c
> index 4f9a40f..b3dfb3b 100644
> --- a/src/locking.c
> +++ b/src/locking.c
> @@ -486,6 +486,8 @@ static void _XInternalLockDisplay(
>  static void _XUserLockDisplay(
>      register Display* dpy)
>  {
> +    _XDisplayLockWait(dpy);
> +
>      if (++dpy->lock->locking_level == 1) {
>         dpy->lock->lock_wait = _XDisplayLockWait;
>         dpy->lock->locking_thread = xthread_self();

I made Bart think about this over lunch today and I think this is
the right fix for this problem.

Here's a lengthy discussion of the code. It took me a long time to
study the code, and I'm sorry that it will take you a long time to just
read this message...

Recall that the problem occurs when a thread acquires the Display mutex
without waiting for locking_level to go to zero. If it then enters
_XUserLockDisplay, it will increment locking_level, at which point we
effectively have two separate threads "owning" the 'user lock', and
chaos ensues.

The 'user lock' is a recursive mutex which is implemented with the
following two values:

        1) locking_level
                This counts the number of acquisitions of the lock; the
                lock is held until the count returns to zero.

        2) locking_thread
                This marks which thread took the lock the first time. This
                thread will not block in _XDisplayLockWait, which is how
                the mutex can recurse.

The implementation is a bit weird -- the usual blocking part of mutex
acquisition is separated from the increment of the count. That's because
of the weird semantics of XLockDisplay/XUnlockDisplay. Those functions
implement a mutex that affects *all* threads using Xlib, not just those
sharing in the XLockDisplay/XUnlockDisplay fun.

When a thread calls XLockDisplay, every other thread using Xlib is
locked out of some Xlib activities. This is done by blocking every other
thread in LockDisplay while the locking_level remains positive, while
letting the thread holding the lock through by comparing its thread-id
with locking_thread. Simple, and allows for recursive locks as
locking_level can be incremented more than once.
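The scheme above can be sketched in C with pthreads. The struct and
function names here are illustrative only (the real state lives in
dpy->lock and the real wait routine is _XDisplayLockWait); this is just
a condensed model of the predicate described above:

```c
#include <pthread.h>

/* Condensed sketch of the 'user lock' state discussed above.
 * Field names follow the discussion, not the real Xlib layout. */
struct user_lock {
    pthread_mutex_t mutex;          /* stands in for the Display mutex */
    pthread_cond_t  cv;             /* signalled when locking_level drops to 0 */
    int             locking_level;  /* acquisition count; lock held while > 0 */
    pthread_t       locking_thread; /* first thread to take the lock */
};

/* The _XDisplayLockWait pattern: every thread blocks while the user
 * lock is held, except the thread recorded in locking_thread, which
 * passes straight through -- that is what makes the lock recursive.
 * Must be called with ->mutex held. */
static void display_lock_wait(struct user_lock *l)
{
    while (l->locking_level > 0 &&
           !pthread_equal(l->locking_thread, pthread_self()))
        pthread_cond_wait(&l->cv, &l->mutex);
}
```

Note that the wait predicate checks both conditions: a non-owning thread
sleeps on the condition variable until locking_level reaches zero, while
the owning thread falls through immediately.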

However, it doesn't allow for more than one thread to hold the 'user
lock'. The current implementation of _XUserLockDisplay must therefore
assume that any thread calling it has already ensured that locking_level
is zero, or that the calling thread is the locking_thread. This is
always true after _XLockDisplay is invoked -- that always calls
_XDisplayLockWait to block execution until locking_level is zero. It is
*not* always true after _XInternalLockDisplay is called; when 'wskip' is
non-zero, locking_level may be non-zero, and locking_thread may well be
another thread.

*************** Obvious Fix #1: ********************

An obvious fix would seem to be to just make _XInternalLockDisplay
always wait for a thread holding the 'user lock' to release it (forcing
'wskip' to zero). That doesn't work in one important case -- in
_XReadEvents. In that function, an event is read from xcb, then the
mutex re-acquired and the event posted to the display. Only one thread
is allowed to wait in xcb for the event. Other event readers will pend
on the 'event_notify' condition variable. Once the event is read from
xcb, the mutex is re-acquired, the event posted to Xlib and threads
waiting on the condition variable awoken with a broadcast.

Consider the sequence:

        Thread A                        Thread B
                                          _XUserLockDisplay    (locking level 1)
            InternalLockDisplay                                (would wait forever for locking_level to go to zero)

If the 'InternalLockDisplay' call were to wait for the 'user lock', it
would never wake up as 'Thread B' holds that lock and is blocked,
waiting for 'Thread A' to signal the condition variable.

**************** Obvious Fix #2: ****************

The second 'obvious' fix that occurred to me was to *always* hold the
'user lock' across calls to xcb; this would prevent another thread from
acquiring the 'user lock' while off in xcb, converting the above into:

        Thread A                        Thread B
            _XUserLockDisplay                                   (locking level 1)
                                            ConditionWait       (wait for locking_level to go to zero)
            _XUserUnlockDisplay                                 (locking level goes to zero)
                                          _XUserLockDisplay     (bump locking_level to 1)

This would 'work', but it would prevent *any* Xlib function from
being called while a thread was blocked in xcb_wait_for_event, which
doesn't exactly make multi-threaded applications practical.

*************** Provided fix: ******************

The fix provided in the above patch prevents multiple threads from
acquiring the 'user lock' at the same time, by blocking threads in
_XUserLockDisplay until the lock has been released.
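To make the fix concrete, here is a self-contained model of the repaired
acquire/release pair (again with illustrative names, not the real Xlib
ones). The one-line change from the patch is the wait *before* the
increment:

```c
#include <pthread.h>

/* Illustrative model of the fixed user lock; caller holds ->mutex. */
struct user_lock {
    pthread_mutex_t mutex;          /* stands in for the Display mutex */
    pthread_cond_t  cv;             /* signalled when locking_level drops to 0 */
    int             locking_level;
    pthread_t       locking_thread;
};

static void lock_wait(struct user_lock *l)          /* cf. _XDisplayLockWait */
{
    while (l->locking_level > 0 &&
           !pthread_equal(l->locking_thread, pthread_self()))
        pthread_cond_wait(&l->cv, &l->mutex);
}

static void user_lock_acquire(struct user_lock *l)  /* cf. _XUserLockDisplay */
{
    lock_wait(l);               /* the fix: wait before incrementing, so
                                 * two threads can never both "own" it */
    if (++l->locking_level == 1)
        l->locking_thread = pthread_self();
}

static void user_lock_release(struct user_lock *l)  /* cf. _XUserUnlockDisplay */
{
    if (--l->locking_level == 0)
        pthread_cond_broadcast(&l->cv);  /* wake threads in lock_wait */
}
```

A second thread entering user_lock_acquire now sleeps in lock_wait until
the count returns to zero, while the owning thread can still recurse
freely.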

*************** Further notes on _XReadEvents: ********************

There is a comment in _XReadEvents that mentions that Xlib respected the
'user lock' when waking up after reading events. As noted above, we must not
block on the 'user lock' before queuing the event and broadcasting to
event_notify. However, there's no reason we cannot block after that. I
propose the following change:

commit 12df097be2283873a1f17af084803788506c97d8
Author: Keith Packard <keithp at keithp.com>
Date:   Fri Mar 9 16:24:55 2012 -0800

    Restore Xlib semantics of blocking on user lock from _XReadEvents

    This ensures that XLockDisplay/XUnlockDisplay prevents any thread from
    receiving X events, as Xlib did.

    Signed-off-by: Keith Packard <keithp at keithp.com>

diff --git a/src/xcb_io.c b/src/xcb_io.c
index 0af47d8..7b5debc 100644
--- a/src/xcb_io.c
+++ b/src/xcb_io.c
@@ -412,6 +412,13 @@ void _XReadEvents(Display *dpy)
                        dpy->xcb->next_event = event;
+                       /* Now that the event has been queued in Xlib, wait
+                        * for any user lock and start over */
+                       if (dpy->lock->lock_wait) {
+                           (*dpy->lock->lock_wait) (dpy);
+                           continue;
+                       }
                /* We've established most of the conditions for

************** Summary ****************

 1) We must prevent multiple threads from holding the 'user lock'.
    ('user lock' is a simple counting lock with only one thread id)

 2) We cannot unconditionally wait for the 'user lock' in
    _XInternalLockDisplay (XNextEvent will deadlock)

 3) We cannot hold the 'user lock' across blocking xcb operations
    (Xlib wants to permit parallel event reading and request submission)

Given these requirements, it seems like the correct solution is to block
before acquiring the 'user lock' until other locking threads are done
with it. This imposes the minimum blocking necessary to ensure that the
'user lock' is only ever held by a single thread.

keith.packard at intel.com
