[RFC] Multitouch support, step one

Peter Hutterer peter.hutterer at who-t.net
Sun Mar 14 23:56:05 PDT 2010

Alrighty, I've been thinking some more about multitouch and here's my
current proposal:

Good news first - we can probably make it work.
Bad news second - not quite just yet, not without kludges.

Multi-touch as defined in this proposal is limited to single input-point
multi-touch. This is suitable for indirect touch devices (e.g. touchpads)
and partially suited for direct touch devices provided a touch is equivalent
to a single-gesture single-application input.

"true" multitouch, i.e. multiple independent input points across multiple
clients, is not covered here; at this point this problem is unsolved.
The trick is to get us the former without limiting future use of the
latter.

I believe this is pretty much what Win 7 or OS X do so I won't bother
claiming this is innovative. This isn't exactly my idea either, I'm just
writing up what I got from talking to Benjamin, Bradley, Henrik, Stephane,
and many more.

The data we get from the (Linux) kernel includes essentially all the ABS_MT
events, x, y, w, h, etc. We can pack this data into valuators on the device.
In the simplest case, a device with two touchpoints would thus send 4
valuators - the first two being the coordinate pair for the first touch
point, the latter two the coordinates for the second touch point.
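
To make the packing concrete, here is a rough sketch of the two-touchpoint
case. It is plain Python, not server or driver code, and the names
pack_touchpoints and unpack_touchpoints are made up for illustration only.

```python
from typing import List, Tuple

Touch = Tuple[int, int]  # (x, y) of one touchpoint


def pack_touchpoints(touches: List[Touch]) -> List[int]:
    """Flatten touch coordinates into one valuator array: x1, y1, x2, y2, ..."""
    valuators: List[int] = []
    for x, y in touches:
        valuators.extend((x, y))
    return valuators


def unpack_touchpoints(valuators: List[int]) -> List[Touch]:
    """Inverse of pack_touchpoints; expects an even number of valuators."""
    if len(valuators) % 2:
        raise ValueError("valuators must come in x/y pairs")
    return [(valuators[i], valuators[i + 1]) for i in range(0, len(valuators), 2)]
```

Two touches at (10, 20) and (30, 40) would thus travel as the four valuators
10, 20, 30, 40.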

XI2 provides us with axis labels, so we can label the axes accordingly.
Clients that don't read axis labels are left guessing what the fancy values
mean, which is exactly what they're doing already anyway.

XI2 DeviceEvents provide a bitmask for the valuators present in a device.
Hence, a driver can dynamically add and remove valuators from events, thus
providing information about the presence of these valuators.
E.g. a DeviceEvent with valuators [1-4] means two touchpoints are down; if
the next event only includes valuators [3-4], the first touchpoint has
disappeared.
Core requires us to always send x/y, hence for core emulation we should
always include _some_ coordinates that are easily translated. While the
server does caching of absolute values, I think it would be worthwhile to
always have an x/y coordinate _independent of the touchpoints_ in the event.
The driver can decide which x/y coordinates are chosen if the first
touchpoint becomes invalid.

Hence, the example with 4 valuators above becomes a device with 6 valuators
instead: x/y and the two coordinate pairs as mentioned above. If extra data
is provided by the kernel driver, these pairs are simply extended into
tuples of values, appropriately labeled.
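
A sketch of the 6-valuator layout, again in plain Python. The policy of
letting the first active touchpoint drive the core x/y is only one possible
driver choice, and build_valuators is a name I made up for this example.

```python
from typing import Dict, List, Optional, Tuple

Touch = Tuple[int, int]


def build_valuators(touches: List[Optional[Touch]]) -> Dict[int, int]:
    """Map valuator index -> value for one event.

    Assumed layout: 0/1 are the core x/y, touch n lives in 2 + 2n and 3 + 2n.
    Inactive touchpoints (None) are simply left out of the event.
    """
    event: Dict[int, int] = {}
    for n, touch in enumerate(touches):
        if touch is None:
            continue
        x, y = touch
        if not event:  # hypothetical policy: first active touch drives core x/y
            event[0], event[1] = x, y
        event[2 + 2 * n], event[3 + 2 * n] = x, y
    return event
```

If the first touchpoint goes away, the core x/y in this sketch switches over
to the second one while valuators 2 and 3 drop out of the event.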

Core clients will ignore the touchpoints and always process the first two
valuators (x/y).
XI1 clients will have to guess what the valuators mean or have it set up
manually in the client.
XI2 clients will automagically work since the axes are labeled. Note that
any client that receives such an event always has access to _all_
touchpoints on the device. This works fine for say 4-finger swipes on a
touchpad but isn't overly useful for the multiple client case, see
below.

Since additional touchpoints are valuators only, grabs work as if the
touches belong to a single device. If any client grabs this device, the
others will miss out on the touchpoints.

XI2 allows devices to change at runtime. Hence a device may add or remove
valuators on-the-fly as touchpoints appear and disappear. There is a chance
of a race condition here. If a driver decides to add/remove valuators
together with the touchpoints, a client that skips events may miss out.
e.g. if a DeviceChanged event that removes an axis is followed by one that
adds an axis, a client may only take the second one as current, thus
thinking the axis was never removed. There is nothing in the XI2 specs that
prohibits this. Anyways, adding and removing axes together with touchpoints
seems superfluous if we use the presence of an axis as indicator for touch.
Rather, I think a device should be set up with a fixed number of valuators
describing the default maximum number of touchpoints. Additional ones can be
added at runtime if necessary.
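
The race can be modelled in a few lines. This is a toy model, not XI2 client
code: the axis names are placeholders and observed_changes is a hypothetical
helper that compares the axis sets a client sees before and after each
DeviceChanged event it actually processes.

```python
from typing import FrozenSet, List, Tuple

Axes = FrozenSet[str]


def observed_changes(
    initial: Axes, changes: List[Axes], coalesce: bool
) -> List[Tuple[Axes, Axes]]:
    """Return the (removed, added) axis sets a client sees per processed event.

    A coalescing client only looks at the most recent DeviceChanged event.
    """
    if coalesce:
        changes = changes[-1:]
    seen = []
    current = initial
    for new in changes:
        seen.append((current - new, new - current))
        current = new
    return seen
```

Feed it a removal followed by an addition of the same axes: the careful
client sees both steps, while the coalescing client sees one event with
nothing removed and nothing added, so the new touch looks like the old one.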

Work needed:
- drivers: updated to parse ABS_MT_FOO and forward it on.
- X server: the input API still uses the principle of first + num_valuators
  instead of the bitmask that the XI2 protocol uses. These calls need to be
  added and then used by the drivers.
- Protocol: no protocol changes are necessary, though care must be taken
  with regard to XI1 clients.
  Although the XI2 protocol does allow device changes, this is not specified
  in the XI1 protocol, suggesting that once a device changes, potential XI1
  clients should be either ignored or limited to the set of axes present
  when they issued the ListInputDevices request. Alternatively, the option
  is to just encourage XI1 clients to go the way of the dodo.
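
For the X server item above, the gap between the two conventions can be
sketched as follows. A (first, num_valuators) pair can only describe one
contiguous run, whereas a bitmask can describe any subset. Both functions
are hypothetical and only illustrate the conversion, they are not the
server's input API.

```python
from typing import List, Tuple


def range_to_mask(first: int, num_valuators: int) -> bytes:
    """Convert a contiguous (first, num_valuators) range into a bitmask."""
    if first < 0 or num_valuators < 0:
        raise ValueError("first and num_valuators must be non-negative")
    last = first + num_valuators
    mask = bytearray((last + 7) // 8)
    for bit in range(first, last):
        mask[bit >> 3] |= 1 << (bit & 7)
    return bytes(mask)


def mask_to_ranges(mask: bytes) -> List[Tuple[int, int]]:
    """Split a bitmask into the (first, num_valuators) runs needed to express it."""
    ranges = []
    start = None
    nbits = len(mask) * 8
    for bit in range(nbits + 1):
        is_set = bit < nbits and bool(mask[bit >> 3] & (1 << (bit & 7)))
        if is_set and start is None:
            start = bit
        elif not is_set and start is not None:
            ranges.append((start, bit - start))
            start = None
    return ranges
```

An event carrying x/y plus only the second touchpoint (valuators 0, 1, 4, 5)
needs two separate ranges in the old scheme but fits into a single mask.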

Corner cases:
We currently have a MAX_VALUATORS define of 32. This may or may not be
arbitrary and interesting things may or may not happen if we increase that.

A device exposing several axes _and_ multitouch axes will need to be
appropriately managed by the driver. In this case, the "right" thing to do
is likely to expose non-MT axes first and tack the MT axes onto the back.
Some mapping may need to be added.
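
One way such a mapping might look, as a sketch only. The label strings are
placeholders rather than real axis label atoms, MAX_VALUATORS is the value
quoted above, and build_axis_labels and mt_axis_index are made-up names.

```python
from typing import List

MAX_VALUATORS = 32  # value quoted in this mail


def build_axis_labels(non_mt: List[str], mt_fields: List[str], max_touches: int) -> List[str]:
    """Lay out non-MT axes first, then one group of MT axes per touchpoint."""
    labels = list(non_mt)
    for n in range(max_touches):
        labels.extend(f"{field} (touch {n})" for field in mt_fields)
    if len(labels) > MAX_VALUATORS:
        raise ValueError(f"{len(labels)} axes exceed MAX_VALUATORS ({MAX_VALUATORS})")
    return labels


def mt_axis_index(non_mt_count: int, mt_field_count: int, touch: int, field: int) -> int:
    """Valuator index of MT field `field` for touchpoint `touch` in that layout."""
    return non_mt_count + touch * mt_field_count + field
```

A device with x, y and pressure plus two touchpoints of two fields each ends
up with 7 axes, and the second touchpoint's x lands at index 5. Ten
touchpoints with four fields each would already blow past the limit of 32.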

The future addition of real multitouch will likely require protocol changes.
These changes will need to include a way of differentiating a device that
does true multitouch from one that does single-point multi-touch.

That's it, pretty much (well, not much actually). Feel free to poke holes
into this proposal.


