X input event generation thread (v2)

Mark Kettenis mark.kettenis at xs4all.nl
Wed Oct 13 00:27:14 PDT 2010


> From: Adam Jackson <ajax at nwnk.net>
> Date: Wed, 29 Sep 2010 11:35:53 -0400
> 
> On Tue, 2010-09-28 at 23:00 +0200, Mark Kettenis wrote:
> 
> > If the input thread is going to run the same code as the SIGIO
> > handlers do now, I fear this isn't going to fly.  That code simply
> > isn't thread-safe.  The biggest issue is the code that draws pointers.
> > On multi-card setups, that will cause the input thread to switch the
> > VGA arbiter to the device on which the pointer is visible.  If that
> > happens while the main server thread is drawing on a different device
> > bad things will happen (typically things just lock up).
> 
> As we said at XDS, this is merely a bug.  The sprite code is written to
> expect this case and get it right (by deferring sprite updates if they
> would trigger a screen crossing), so if it's broken that's something we
> introduced and need to fix.

I think it does illustrate though that getting things right is
difficult.  It isn't the first SIGIO-related bug I've seen.

> > Of course the same problem exists with SIGIO.  After realizing how
> > much code was run from the SIGIO signal handler, and the VGA arbiter
> > issues related to that we decided to disable SIGIO on OpenBSD
> > (encouraged by Alan C.'s statement that Solaris does the same).  As
> > far as I can tell, the only effect of this is that it disables the
> > "silken mouse".  Quite a few OpenBSD users tested that change, and all
> > of them indicated that they noticed no difference whatsoever.
> 
> Your users are clearly not sensitive to input latency.  Mine are.

There's probably a selection effect here.  OpenBSD users tend to
prefer tiling window managers over "3D" desktops.

> But, numbers.  The current perceptual behaviour of the SIGIO code is
> that, in the common non-screen-crossing case, the cursor is perfectly
> stuck to the screen; updates happen on the very next vertical retrace.
> If you wanted to preserve that behaviour, you have options.
> 
> For example: After every request, select() for input again and process
> it if there is any.  That's certainly something you can do.  Here's how
> that looks to your request throughput numbers:
> 
> 1: Xvfb-normal.perf
> 2: Xvfb-select-happy.perf
> 
>     1              2           Operation
> --------   -----------------   -----------------
> 542000.0   535000.0 (  0.99)   100x100 rectangle
> 268000.0   258000.0 (  0.96)   ShmPutImage 100x100 square
> 14700000.0   7740000.0 (  0.53)   X protocol NoOperation
>  19400.0    16500.0 (  0.85)   QueryPointer
> 574000.0   554000.0 (  0.97)   Create and map subwindows (4 kids)
> 49900000.0   40100000.0 (  0.80)   Dot
> 
> So, anything that's round-trip-limited gets 15% slower, anything that's
> request-rate-limited gets 20% to 50% slower, but anything that's
> actually bounded by server execution time is pretty much unaffected.
> Not the end of the world, but not something I'd ship.

Matthieu Herrb seemed to suggest that the effect of the select(2)
overhead is somewhat OS-dependent and that it manifests itself more on
Linux than on the BSDs.  But indeed, this is a significant effect.
Thanks for taking the trouble to run those benchmarks.

> But, of course, you don't really need to do that.  You can do something
> like what the smart scheduler does, possibly even still _using_ SIGIO to
> do it: if input comes in, raise a flag (isItTimeToYield, in fact) to
> bomb back out from request processing to the select loop.

That approach makes sense to me.  Ultimately, doing much more than just
setting a flag in a signal handler will get you into big trouble real
soon.  Traditionally on UNIX you're not even supposed to do
floating-point arithmetic in a signal handler (the "hilarious" signal
handler bugs that got mentioned some time ago aren't *that* hilarious).
The current SIGIO handler scares the hell out of me.  It is impossible
to determine which bits of code can be run from it, especially since
that includes code in drivers.

I looked into implementing your suggestion, but unfortunately I ran
out of time.  Too much work-related travel this time of year.  And I
still need to test your libpciaccess changes on OpenBSD as well :(.
Hopefully I'll have time for this at the end of November.

But I'd really like to keep this option open, and Tiago's diff seems
to remove some of the infrastructure for this.

> That's cheap enough, but you're still bounded by the time required
> to actually complete a request, and it's pretty trivial to get that
> to spike well north of 100ms.  Even discounting pathological
> clients, it's pretty easy for something like firefox to submit
> enough work to keep you away from dispatch for multiple frame
> intervals.  You're not necessarily doing anything besides moving the
> pointer during that, but that's not a reason to let the pointer
> skip.

But a thread-based approach would suffer from the same issues.  You'll
need to protect large parts of the request-handling code with a mutex.
At the very minimum, all the bits that do hardware access need to be
protected (something Tiago's diff doesn't seem to do).  You can
probably make the locking more fine-grained than per-request to reduce
latency a bit, but that makes it harder to get the locking right.  And
it isn't clear to me that you can avoid the 100ms+ spikes you mention.

Cheers,

Mark


More information about the xorg-devel mailing list