[PATCH xserver v3] xwayland: handle EAGAIN on Wayland fd

Daniel Stone daniel at fooishbar.org
Thu Sep 15 10:19:20 UTC 2016


Hi,

On 15 September 2016 at 11:09, Olivier Fourdan <ofourdan at redhat.com> wrote:
>> > What this patch does is to avoid dispatching to the Wayland file
>> > descriptor until it becomes available for writing again, while at the
>> > same time continue processing X11 requests to release the deadlock.
>>
>> Could you expand on why the Wayland fd should not be dispatched while
>> you wait for it to flush?
>>
>> Is it just because dispatching it is likely to cause more Wayland
>> requests to be sent? I wonder how that holds up. Maybe it doesn't
>> matter as the premise is that the Wayland compositor is stuck.
>
> Actually, the idea is to minimize the risk of breaking further down in libwayland itself, either in copy_fds_to_connection()
> ("request could not be marshaled: can't send file descriptor") or in wl_proxy_marshal_array_constructor_versioned() ("Error sending request: Resource temporarily unavailable")

So, yes: you want to avoid sending more traffic down a socket which is
already full.

>> If all replies are immediate, it sounds like this would be a fairly
>> reliable fix after all.
>
> To be honest, I wouldn't consider this a fix, merely a workaround, not to say a ugly hack, but afaics from my testing here, it avoids the lock (even though, with vlc fullscreen, the deadlock reoccur continuously when the pointer hovers the timeline as the tooltip, a shaped window, gets mapped/unmapped continuously which causes mutter to issue a lot of those blocking rountrips) - Yet it offers a chance to get out of the deadlock (eventually...) and not lose Xwayland, which for gnome-shell/mutter means losing the entire user session.

Indeed, it is a workaround: X11 requests can still provoke more
activity. Consider where you ignore all Wayland events, but gtkperf
continues rendering as an X11 client. Flushing that rendering to the
upstream compositor (wl_surface_attach/damage/commit/etc) will cause
more traffic on the Wayland connection. You might be able to do even
better by calling
OnlyListenToOneClient(clientInfo.clients[wm_client_index]) when
transitioning wait_flush from 0 to 1, and ListenToAllClients() when it
transitions back from 1 to 0.

That being said, it's a good start, and I definitely believe you when
you say it improves things for now. I think in the long run it does
need to be extended so we avoid things like flushing more rendering,
but with EINTR checked in xwl_display_pollout:
Reviewed-by: Daniel Stone <daniels at collabora.com>

Cheers,
Daniel


More information about the xorg-devel mailing list