weird Xwayland and compositor deadlock issue [WAS: [PATCH xserver v2] xwayland: handle EAGAIN and EINTR gracefully]
Jasper St. Pierre
jstpierre at mecheye.net
Sat Sep 17 05:21:42 UTC 2016
Hi,
Based on my reading of the spec, writing an ICCCM-compliant WM *requires*
blocking, since the behavior of an UnmapNotify depends on the attributes of
a window. We cannot process any X11 events while we are retrieving the
attributes of a mapped window inside MapRequest.
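To make the failure mode concrete for the Xwayland case quoted below, here is
a minimal sketch in Python. It is a stand-in, not Xwayland or mutter code: a
plain socketpair plays the role of the Wayland connection, and the names
xwayland, compositor, fill_until_eagain, is_writable and drain are made up for
illustration. It shows a non-blocking writer hitting EAGAIN once the peer stops
reading (the state Xwayland is in while the compositor sits in a blocking X11
round trip), and that the socket only becomes writable again after the peer
drains it.

```python
import select
import socket


def fill_until_eagain(sock, chunk=b"x" * 4096):
    """Write to a non-blocking socket until the kernel buffer is full.

    Returns the number of bytes the kernel accepted before EAGAIN.
    """
    sent = 0
    while True:
        try:
            sent += sock.send(chunk)
        except BlockingIOError:  # EAGAIN / EWOULDBLOCK
            return sent


def is_writable(sock):
    """Poll once, without waiting, for writability."""
    _, writable, _ = select.select([], [sock], [], 0)
    return bool(writable)


def drain(sock, nbytes):
    """Read up to nbytes from the peer; returns the number of bytes read."""
    remaining = nbytes
    while remaining > 0:
        data = sock.recv(min(65536, remaining))
        if not data:
            break
        remaining -= len(data)
    return nbytes - remaining


def demo():
    # Hypothetical stand-ins: "xwayland" writes Wayland requests,
    # "compositor" is the end that should be reading them.
    xwayland, compositor = socket.socketpair()
    xwayland.setblocking(False)
    try:
        # The compositor is not reading (imagine it is waiting on an X11
        # reply), so the writer eventually gets EAGAIN.
        queued = fill_until_eagain(xwayland)
        stuck = not is_writable(xwayland)

        # Once the compositor returns to its main loop and consumes the
        # requests, the writer can make progress again.
        drained = drain(compositor, queued)
        recovered = is_writable(xwayland)
        return queued, stuck, drained, recovered
    finally:
        xwayland.close()
        compositor.close()
```

demo() returns a tuple (queued, stuck, drained, recovered). If the writer
blocked at the "stuck" point instead of polling, and the reader was itself
blocked waiting on the writer, neither side would ever move.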
If we want to modify the X11 protocol so that events carry the data we
currently have to block for, e.g. attributes in MapRequest, values in
PropertyNotify, and shapes in ShapeNotify (the three major cases of required
blocking right now), I'd be for it.
Focus management is extremely complex and subtle. Reading back on the
history:
https://bugzilla.gnome.org/show_bug.cgi?id=701017
https://bugzilla.gnome.org/show_bug.cgi?id=720558
The first patch was overly complex -- the XChangeProperty to bump the
serial could have simply been an XNoOp to bump the serial while under server
grab. :) We could even make that cleanup now. But it would be a minor
simplification.
Daniel suggested that timestamps *should* be on the same timebase.
Currently, they are not. X11 server timestamps are
CLOCK_MONOTONIC_COARSE-based and are calculated at delivery time, evdev
timestamps are CLOCK_MONOTONIC-based and are calculated at input time. This
is why there are several focus management bugs that happen when you replace
meta_display_get_current_time_roundtrip() with a clock_gettime().
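A rough sketch of the mismatch, in Python rather than server code. The helper
names (read_ms, server_style_ms, stamp_event) are hypothetical, and the clock
id 6 for CLOCK_MONOTONIC_COARSE is the Linux value, which Python's time module
does not export; on systems where that id is rejected the sketch falls back to
CLOCK_MONOTONIC. It stamps one "event" twice: once at input time with the fine
clock, as evdev would, and once at delivery time with the coarse clock, as
described above for the X server.

```python
import time

# Assumption: Linux clock id from <linux/time.h>; not exported by Python.
CLOCK_MONOTONIC_COARSE = 6


def read_ms(clock_id):
    """Read a clock and convert to whole milliseconds."""
    return int(time.clock_gettime(clock_id) * 1000)


def server_style_ms():
    """Coarse, delivery-time timestamp (falls back if the id is rejected)."""
    try:
        return read_ms(CLOCK_MONOTONIC_COARSE)
    except OSError:
        return read_ms(time.CLOCK_MONOTONIC)


def stamp_event(delivery_delay_s=0.0):
    """Return (input_ms, delivery_ms) for one simulated event."""
    # evdev-style: fine-grained clock, sampled when the input happens.
    input_ms = read_ms(time.CLOCK_MONOTONIC)
    # Simulated time the event spends queued before it is delivered.
    time.sleep(delivery_delay_s)
    # Server-style: coarse clock, sampled when the event is delivered.
    delivery_ms = server_style_ms()
    return input_ms, delivery_ms
```

The two values for the same event differ by the queueing delay, and the coarse
clock can additionally lag the fine one by up to a scheduler tick, so a
timestamp taken with a plain clock_gettime() does not line up with what the
server reports.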
We need to fix this, otherwise we can never properly synchronize X11 event
streams and Wayland event streams. But Xorg calls GetTimeInMillis()
literally everywhere and compares against that instead of using evdev's own
timestamps, and I doubt we can fix that without breaking multiple, multiple
clients.
The only thing I can think of for that is, again, the Wayland-in-X11
solution: an X11 extension that delivers the timestamp with every response
and event from the server so we don't block on a PropertyChange for it.
On Wed, Sep 14, 2016 at 12:56 AM, Pekka Paalanen <ppaalanen at gmail.com>
wrote:
> On Tue, 13 Sep 2016 12:04:14 -0400 (EDT)
> Olivier Fourdan <ofourdan at redhat.com> wrote:
>
> > Hi Pekka,
> >
> > ----- Original Message -----
> > > Hi Olivier,
> > >
> > > I don't have any solution for you. The interactions between the Wayland
> > > compositor and Xwayland are known to be very easily deadlockable IIRC. I
> > > believe the only thing you can do is ensure no such case can ever
> > > occur, which is very painful. That is, never do a blocking roundtrip at
> > > least from one side.
> > >
> > > Have the recent modifications caused a significant increase of Wayland
> > > requests from Xwayland? If Xwayland needs to send an amount of data
> > > bigger than bufferable, *any* blocking roundtrip via X11 from the
> > > Wayland compositor is prone to deadlock. It will be waiting for a reply
> > > via X11, while Xwayland is blocked on flushing, since the Wayland
> > > compositor is not consuming requests.
> > >
> > > It can also trivially happen if both sides do a blocking roundtrip at
> > > the same time. Or just a wait for an event.
> > >
> > > Either server needs to be able to return to its main loop to process the
> > > protocol stream it is the server for. Preferably both, I think.
> >
> > Unfortunately, any XSync (like, for example, called in
> > gdk_error_trap_pop() in gdk) will issue a blocking roundtrip, and
> > window managers tend to do that quite a lot (some more than others)
> > so I don't think we can easily change that in window managers to
> > avoid blocking roundtrips on X11 side.
> >
> > > You could check how Weston's XWM works. I highly suspect that after
> > > Xwayland launch it avoids doing any blocking roundtrips via X11.
> >
> > Yet sometimes some X calls are blocking, e.g. XShapeGetRectangles()
> > or even XGetWindowAttributes() which is invoked by mutter each time
> > a new window is mapped. mutter still uses Xlib and not xcb.
> >
> > > I'd assume Xwayland also tries to avoid blocking on Wayland events,
> > > but if nothing else, I believe Mesa via GLAMOR may block on
> > > wl_buffer.release events... or maybe not if GLAMOR is smart with its
> > > throttling. Anyway, since your flush is hitting EAGAIN, that doesn't
> > > seem to be the cause.
> > >
> > > I wonder if making wl_display_flush() block immediately like in your
> > > patch could be replaced by adding the wl_display fd to the main poll
> > > loop, so that it would get flushed ASAP but still service X11
> > > requests in the mean time? It does run the risk of overflowing the
> > > Wayland send buffer in Xwayland. Any way to prioritize the Wayland
> > > compositor's X11 connection in Xwayland?
> >
> > If I don't make EAGAIN a FatalError() and wait for the Wayland
> > display file descriptor to become writable again, Xwayland eventually
> > dies with another error "(EE) request could not be marshaled: can't
> > send file descriptor" from libwayland directly (in
> > copy_fds_to_connection()).
>
> Hi,
>
> summarizing from #wayland irc between Olivier and Daniel: the proper
> solution is indeed to never do blocking X11 roundtrips from the Wayland
> compositor, but for practical reasons that might not be possible.
>
> The irc log starts here:
> https://people.freedesktop.org/~cbrill/dri-log/index.php?channel=wayland&highlight_names=&date=2016-09-13#t-1402
>
>
> Thanks,
> pq
>
--
Jasper