[Xcb] [PATCH xcb] Fix hanging issue in _XReply

Uli Schlachter psychon at znc.in
Sat Feb 17 11:25:19 UTC 2018


[Keeping CC list, sorry for this ending up in the moderation queue of
xorg-devel]

On 14.02.2018 22:38, Adam Jackson wrote:
> On Fri, 2018-01-05 at 16:49 +0200, Ian Ray wrote:
>> From: Tatu Frisk <tatu.frisk at ge.com>
>>
>> Assume event queue is empty if another thread is blocking waiting for event.
>>
>> If one thread was blocking waiting for an event and another thread sent a
>> reply to the X server, both threads got blocked until an event was
>> received.
>>
>> Signed-off-by: Tatu Frisk <tatu.frisk at ge.com>
>> Signed-off-by: Jose Alarcon <jose.alarcon at ge.com>
>> Signed-off-by: Ian Ray <ian.ray at ge.com>
> 
> I came across this patch again today. It looks correct to me but I
> don't claim to fully understand libX11; it would be nice if an xcb
> expert could take a look.

I would say that anyone who claims to fully understand libX11 must be
wrong. ;-)

>>  src/xcb_io.c | 18 +++++++-----------
>>  1 file changed, 7 insertions(+), 11 deletions(-)
>>
>> diff --git a/src/xcb_io.c b/src/xcb_io.c
>> index 649c820..4a8bb27 100644
>> --- a/src/xcb_io.c
>> +++ b/src/xcb_io.c
>> @@ -607,18 +607,14 @@ Status _XReply(Display *dpy, xReply *rep, int extra, Bool discard)
>>  		if(dpy->xcb->event_owner == XlibOwnsEventQueue)
>>  		{
>>  			xcb_generic_reply_t *event;
>> -			/* If some thread is already waiting for events,
>> -			 * it will get the first one. That thread must
>> -			 * process that event before we can continue. */
>> -			/* FIXME: That event might be after this reply,
>> -			 * and might never even come--or there might be
>> -			 * multiple threads trying to get events. */
>> -			while(dpy->xcb->event_waiter)
>> -			{ /* need braces around ConditionWait */
>> -				ConditionWait(dpy, dpy->xcb->event_notify);
>> +
>> +			/* Assume event queue is empty if another thread is blocking
>> +			 * waiting for event. */
>> +			if(!dpy->xcb->event_waiter)
>> +			{
>> +				while((event = poll_for_response(dpy)))
>> +					handle_response(dpy, event, True);
>>  			}
>> -			while((event = poll_for_event(dpy)))
>> -				handle_response(dpy, event, True);
>>  		}
>>  
>>  		req->reply_waiter = 0;

I would also say that this is more of a libX11 question and an xcb one.
Anyway...

So, this patch assumes that dpy->xcb_event_waiter != 0 means that there
cannot be any events in XCB's(?) event queue. When is event_waiter set?
The only place I can find is in libX11:src/xcb_io.c, function
_XReadEvents(), currently lines 382 to 402 which is (simplified pseudo
code):

if(!dpy->xcb->next_event)
{
    event_waiter = 1;
    UnlockDisplay(dpy);
    xcb_wait_for_event();
    InternalLockDisplay(dpy, 1);
    event_waiter = 0;
    dpy->xcb->next_event = 0;
}

This code clearly does not do anything to empty the event queue. Let's
just assume that there is an event sitting in xcb's event queue, then
event_waiter = 1 is set, the display is unlocked and other threads are
free to observe this even though an event is waiting.

So to me, this patch looks wrong.

To produce a correct patch, I have to wonder: What is *the* event queue?
Is it events queued in xcb, or also events queued for receiving on the
socket?

If this really just means events queued in libxcb, then _XReadEvents()
has to check xcb_poll_for_queued_event() *without* setting event_waiter
or unlocking the display. Only if this returns NULL can it do what it
currently does. This NULL clearly means that (at some point) the event
queue was empty.

If the socket has to be polled, the right function to use is
xcb_poll_for_event() and the same argumentation as above applies.

However, I am not sure that the above is correct, since
there is still a race when some event arrives: If another thread
currently has the display locked, it could see the assumption/invariant
"all events that the X11 server sent before it sent the reply that I am
handling right now were already handled" violated.

I do not know enough about libX11 to know the precise wording of this
invariant, nor do I know why it is needed.

Another thought: The problem seems to be that libX11 wants to handle all
"packets" in on-the-wire order, but libxcb does not directly offer a way
to do this. So, sometimes it has to call xcb_wait_for_event() even
though it cannot be sure that one arrives (right?). If so, couldn't
poll()ing on the socket be used to wait until "something" happens and
then xcb_poll_for_event() / xcb_poll_for_reply() can be used to figure
out what happened?
I'm not sure and I do not really know libX11.

Cheers,
Uli
-- 
A learning experience is one of those things that say,
'You know that thing you just did? Don't do that.'
                     -- Douglas Adams


More information about the xorg-devel mailing list