[PATCH inputproto xi 2.1] Updates for pointer emulation and more touch device modes

Peter Hutterer peter.hutterer at who-t.net
Mon Mar 7 21:41:46 PST 2011

On Wed, Mar 02, 2011 at 11:35:41AM -0500, Chase Douglas wrote:
> On 03/02/2011 05:58 AM, Daniel Stone wrote:
> > On Tue, Feb 22, 2011 at 10:06:37AM -0500, Chase Douglas wrote:
> >> --- a/XI2.h
> >> +++ b/XI2.h
> >> @@ -32,6 +32,7 @@
> >>  #define Dont_Check                              0
> >>  #endif
> >>  #define XInput_2_0                              7
> >> +#define XInput_2_1                              8
> > 
> > Peter, what was happening with this hunk?
> > 
> >> @@ -132,16 +133,16 @@
> >>  /* Device event flags (common) */
> >>  /* Device event flags (key events only) */
> >>  #define XIKeyRepeat                             (1 << 16)
> >> -/* Device event flags (pointer events only) */
> >> +/* Device event flags (pointer and touch events only) */
> >>  #define XIPointerEmulated                       (1 << 16)
> >>  /* Device event flags (touch events only) */
> >> -#define XITouchPendingEnd                       (1 << 16)
> >> -/* Device event flags (touch end events only) */
> >> -#define XITouchAccepted                         (1 << 17)
> >> +#define XITouchPendingEnd                       (1 << 17)
> > 
> > Is there any particular reason to unify the two sets of flags? I guess
> > it can't really hurt as we shouldn't be using sixteen flags between
> > pointer and touch, but eh.
>
> We don't have to unify them per se, we could leave them independent.
> However, we need XIPointerEmulated for both touch events and pointer
> events. As an example, through Qt one can query which touch point is
> associated with an emulated pointer event. To do this, we need a way to
> designate that touch in the protocol. Reusing the XIPointerEmulated flag
> for the associated touch seems to be a reasonable solution to me.

I'm not sure how well that works in practice given that pointer events may
be held up by grabs, whereas touch events are delivered immediately. so even
if you have both the pointer and the touch event flagged, you may not get one
of the two events until much later.
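
For reference, a minimal sketch of how a client might test these flags,
assuming the bit values from the XI2.h hunk quoted above. The event-class enum
and both helper names are made up for illustration and are not part of any
header. It only shows that bit 16 has to be interpreted per event class, and
that XITouchPendingEnd needs its own bit once XIPointerEmulated can appear on
touch events. It says nothing about when the two flagged events arrive.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the draft XI2.h hunk quoted above. */
#define XIKeyRepeat        (1 << 16)  /* key events only */
#define XIPointerEmulated  (1 << 16)  /* pointer and touch events */
#define XITouchPendingEnd  (1 << 17)  /* touch events only */

/* Hypothetical event classes, used only to pick the right flag namespace. */
enum event_class { EV_KEY, EV_POINTER, EV_TOUCH };

/* True if a pointer or touch event is flagged as part of pointer emulation. */
static bool is_pointer_emulated(enum event_class cls, uint32_t flags)
{
    if (cls == EV_KEY)
        return false;  /* on key events bit 16 means XIKeyRepeat */
    return (flags & XIPointerEmulated) != 0;
}

/* True if a touch event carries the pending-end flag. */
static bool is_touch_pending_end(enum event_class cls, uint32_t flags)
{
    return cls == EV_TOUCH && (flags & XITouchPendingEnd) != 0;
}
```

With the pre-patch values, XIPointerEmulated and XITouchPendingEnd were both
bit 16, so a touch event could not carry both.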

> >> @@ -205,62 +208,145 @@ to touch the device. The init and destroy stage of this sequence are always
> >>  present, while the move stage is optional. Within this document, the term
> >>  "touch sequence" is used to describe the above chain of events. A client
> >>  wishing to receive touch events must register for at least TouchBegin,
> >> -TouchOwnership, TouchUpdate, and TouchEnd simultaneously; it may also select
> >> -for TouchUpdateUnowned events if it wants to receive the full touch stream,
> >> -rather than just the final state.
> >> +TouchUpdate, and TouchEnd simultaneously. It may also select for
> >> +TouchUpdateUnowned and TouchOwnership events if it wants to receive the full
> >> +touch stream while other clients own or have active grabs involving the touch.
> > 
> > I'm not particularly happy with this hunk, as it means we'll be
> > delivering TouchOwnership events to clients who haven't selected for
> > them.  I think it was fairly clear as it is: you must always select for
> > TouchBegin, TouchOwnership, TouchUpdate and TouchEnd.  If you also want
> > unowned events, you select for TouchUpdateUnowned as well.
>
> When would we ever need to send an ownership event if the client didn't
> select for it? If you don't select for ownership and update unowned, you
> won't receive any events until you have become the owner of the touch.
> When you receive the begin event, you already know you're the owner, so
> an ownership event isn't needed.

daniel's approach requires that the TouchBegin is sent immediately to all
clients, and the TouchOwnership when a client becomes the owner. your approach
holds the TouchBegin until the client becomes the owner, thus being
more-or-less the equivalent of the ownership event in daniel's approach.
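
To make the difference concrete, here is a rough sketch of the event stream a
single non-grabbing client would see under each approach, as I read them from
this thread. The enum, the function names and the string output are all
hypothetical, and this is not server code.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum begin_model {
    BEGIN_IMMEDIATE,  /* TouchBegin at the physical begin, TouchOwnership later */
    BEGIN_HELD        /* TouchBegin held back until the client owns the touch */
};

/* Append an event name to a space-separated, NUL-terminated list. */
static void emit(char *out, size_t cap, const char *name)
{
    size_t len = strlen(out);
    snprintf(out + len, cap - len, "%s%s", len ? " " : "", name);
}

/* Build the stream one client sees for a touch that produces
 * `unowned_updates` updates before the client becomes owner and
 * `owned_updates` afterwards. `cap` must be at least 1. */
static void client_stream(enum begin_model model, bool selects_unowned,
                          int unowned_updates, int owned_updates,
                          char *out, size_t cap)
{
    /* Assumption: in the held model, a client that selects for unowned
     * and ownership events still gets the full stream. */
    bool full_stream = (model == BEGIN_IMMEDIATE) || selects_unowned;

    out[0] = '\0';
    /* In the held model without unowned selection this TouchBegin is
     * only delivered once the client is the owner. */
    emit(out, cap, "TouchBegin");
    if (full_stream) {
        if (selects_unowned)
            for (int i = 0; i < unowned_updates; i++)
                emit(out, cap, "TouchUpdateUnowned");
        emit(out, cap, "TouchOwnership");
    }
    for (int i = 0; i < owned_updates; i++)
        emit(out, cap, "TouchUpdate");
    emit(out, cap, "TouchEnd");
}
```

The strings only capture ordering. The held model delivers its TouchBegin
later in wall-clock time, which is where the questions about coordinates and
timestamps come from.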

> >> +grab. When a client does not select for unowned and ownership events, it will
> >> +receive a TouchBegin event when it becomes the owner of a touch stream.
> >> +TouchUpdate and TouchEnd events will be received in the same manner as for touch
> >> +grabs.
> > 
> > I think it could be clearer to state that:
> >     * clients always receive TouchBegin events immediately before they
> >       start receiving any other events for that touch sequence
> >     * TouchUpdateUnowned events, if selected for, will be sent while the
> >       client does not own the touch sequence
> >     * a TouchOwnership event will be sent when the client becomes the
> >       owner of a touch stream, followed by a sequence of TouchUpdate
> >       events
> >     * a TouchEnd event will be sent when no further events will be sent
> >       to this client for the touch sequence: when the touch has
> >       physically ended, when the client has called AllowTouchEvents with
> >       TouchRejectEnd, when the touch grab owner has called
> >       AllowTouchEvents with TouchAccept, or the pointer grab owner has
> >       called AllowEvents with Async{Pointer,Both}.
>
> This doesn't match what I wrote above :). As I noted in an earlier
> comment, we don't need to send ownership events to clients that don't
> select for unowned events. This makes the client code much cleaner too,
> as they will only have to handle begin, update, and end events.

the danger I see in your spec however is that there is no clear mapping
between the actual touch begin and the one sent to the client. does the
TouchBegin still contain the original coordinates or the current ones? what
about update events that happened between the physical begin and the
TouchBegin? are they buffered and re-sent, or just dropped or compressed?
you mentioned dropping with your ring buffer, but that's an implementation
detail not explained elsewhere.

does the TouchBegin have the same timestamp as the actual touch begin or the
timestamp of when it was sent to the client?
for delayed touches (because a grabbing client takes a while), the
time between TouchBegin and TouchOwnership can be a worthy piece of
information that is otherwise not available. The mere fact that a touch is
currently in progress can be interesting to a client, even if it never
receives the touch event.
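
As an illustration of the last point: if the TouchBegin kept the timestamp of
the physical begin (which is exactly the open question above, so treat it as
an assumption), a client could compute how long a touch was held up. X server
timestamps are 32-bit millisecond values that wrap around, so the subtraction
is done in unsigned arithmetic. The function name is made up.

```c
#include <stdint.h>

/* Milliseconds between TouchBegin and TouchOwnership for one touch.
 * Unsigned subtraction is modulo 2^32, so the result stays correct
 * across a single wrap of the server time. */
static uint32_t ownership_delay_ms(uint32_t begin_time, uint32_t ownership_time)
{
    return (uint32_t)(ownership_time - begin_time);
}
```

If the TouchBegin instead carried the time it was sent to the client, this
value would always be close to zero and the information would be lost.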

> >> +SemiMultitouch:
> >> +    These devices may report touch events that correlate to the two opposite
> >> +    corners of the bounding box of all touches. The number of active touch
> >> +    sequences represents the number of touches on the device, and the position
> >> +    of any given touch event will be equal to either of the two corners of the
> >> +    bounding box. However, the physical location of the touches is unknown.
> >> +    SemiMultitouch devices are a subset of DependentTouch devices. Although
> >> +    DirectTouch and IndependentPointer devices may also be SemiMultitouch
> >> +    devices, such devices are not allowed through this protocol.
> > 
> > Hmmm.  The bounding box being based on corners of separate pointers
> > seems kind of a hack to me.  I'd much rather have the touches all be
> > positioned at the midpoint, with the bounding box exposed through
> > separate axes.
>
> I think the question that highlights our differences is: "Should we
> attempt to handle these devices in the XI 2.1 touch protocol, or fit
> them into the pointer protocol?" In Linux, it's been determined that
> these devices will be handled as multitouch devices. The evdev client
> sees a device with two touch points that are located at the corners of
> the bounding box. The normal synaptics-style event codes for describing
> the number of fingers are used to denote how many touches are active in
> the bounding box.
> I'm of the mindset that these devices should be handled as described in
> XI 2.1. However, I could be persuaded to handle these devices by
> treating them as traditional pointing devices + 5 valuators for
> describing the bounding box and how many touches are active.
>
> > The last sentence also makes me slightly nervous; it seems like we want
> > SemiMultitouch to actually be an independent property, whereby a device
> > is Direct, Dependent or Independent, and then also optionally
> > semi-multitouch.  (Possibly just exposing the bounding box axes would be
> > enough to qualify as semi-multitouch.)  In fact, IndependentPointer
> > could similarly be a property of some DependentTouch devices as well.
>
> I thought about this, but there's a few reasons I did it this way:
> 1. If you want to make it an independent property, then we should change
> the mode field to a bitmask. The field is only 8 bits right now, so we
> could run out of bits very quickly. However, treating the field as an
> integer as it is today allows for 255 variations. We can always revisit
> and add in semi-mt + independent pointer as a new mode later on.
> 2. Semi-mt and direct touch doesn't make sense. You don't know where
> touches are, so you don't know which window to direct events to if the
> bounding box spans multiple windows.
> 3. I believe semi-mt is a dead technology now. I've only ever seen it in
> touchpads, and I don't think they'll ever expand beyond that scope. We
> can always add another device mode if needed.
>
> >> +A device is identified as only one of the device modes above at any time. For
> >> +the purposes of this protocol, IndependentPointer and SemiMultitouch devices are
> >> +treated the same as DependentTouch devices unless stated otherwise.
> > 
> > It would be nice to either go through and clarify every one of these
> > cases, or if we end up keeping these two as separate classes, introduce
> > new unambiguous terminology for the set of all three classes.
> A good idea. I'll try to think of a better naming scheme.
>
> >> +In order to prevent touch events delivered to one window while pointer events
> >> +are implicitly grabbed by another, all touches from indirect devices will end
> >> +when an implicit grab is activated on the slave or attached master device. New
> >> +touches may begin while the device is implicitly grabbed.
> > 
> > This bit makes me _nervous_.  Unfortunately we can only activate one
> > pointer grab at a time, but I'd rather do something like this:
> >     * populate the window set with the pseudocode described near the top
> >       when the touch begins, regardless of the pointer state
> >     * generate touch events as normal
> >     * if ownership is passed to a pointer grab/selection, skip it if
> >       a pointer grab is already active on the delivering device (the MD
> >       if the selection was on the MD ID or XIAllMasterDevices, otherwise
> >       the SD)
> > 
> > It's unpleasant, but I don't like ending all touch events as soon as we
> > start pointer emulation (which will happen a fair bit).  Also: why is
> > this different for direct and indirect devices? Doesn't this completely
> > kill multi-finger gestures if _any_ client (e.g. the WM) has a pointer
> > grab anywhere in the stack?
> > 
> > This bit will definitely require more thought.
> I think you're mixing up a lot of things here :). First, we're only
> talking about indirect devices where there's no pointer emulation.
> Second, we're only talking about implicit grabs that are activated when
> a button is pressed.

this needs to be specified then. AIUI, sending TouchBegin events activates
implicit grabs too.

> However, this does bring up a good point. What do we do when a touch
> begins on an indirect device that is actively grabbed. What do we do
> when a grab is activated?
> I feel as though the only sound thing to do for indirect devices is to
> cancel all touches when any grab is activated, and to not begin any
> touch sequences while any grab is active. This is an extremely heavy
> handed solution to the problem, but I can't think of anything better
> that wouldn't introduce holes into the protocol. Further, there are
> normally two scenarios where grabs are used:
> 1. When a button is pressed. For all multitouch gesture work I've seen
> (and I'm unaware of any other usage of multitouch for indirect devices),
> no buttons are pressed while multitouch events are being handled.

tapping and scrolling both send button events that will likely be grabbed,
even if only temporarily. that's usually on the MD though.

> 2. When doing funky things like confine-to. Hopefully pointer barriers
> are a better solution for this, so we can just say we don't support MT +
> pointer grabs.

hoping that confine_to just disappears is not a good plan of action,
regardless of pointer barriers.

> Based on all this, I don't think we'll be missing that much if we go
> with this approach. Our hands are tied by legacy X protocol choices, and
> this isn't the only compromise we're making :).
>
> >> +Many touch devices will emit pointer events as well, usually by mapping one
> >> +touch sequence to pointer events. In these cases, events for both the pointer
> >> +and its associated touch sequence will have the XIPointerEmulated flag set.
> > 
> > I think we can move this section into pointer emulation, and make sure
> > that it's clearly stated that all pointer events from touch devices will
> > be emulated.
> +1 on moving the text. However, the second point isn't true. An
> independent pointer device does not emulate any pointer events.
>
> >> +4.4.4 Pointer emulation for direct touch devices
> >> +
> >> +In order to facilitate backwards compatibility with legacy clients, direct touch
> >> +devices will emulate pointer events. Pointer emulation events will only be
> >> +delivered through the attached master device; no pointer events will be emulated
> >> +for floating touch devices. Further, only one touch from any attached slave
> >> +touch device may be emulated per master device at any time.
> > 
> > Indirect devices won't do pointer emulation? How about touchpads?
> I think this is a semantics issue that should be addressed. Direct touch
> devices perform pointer emulation in a specific manner as outlined here.
> Indirect devices have pointer emulation of sorts, but there's nothing
> special about it.

Then this needs to be stated in the spec.
"Independent touch devices do not feature pointer emulation, the device is
expected to provide x and y coordinates through conventional axes."

> >> +A touch event stream must be delivered to clients in a mutually exclusive
> >> +fashion. This extends to emulated pointer events. For the purposes of
> >> +exclusivity, emulated pointer events between an emulated button press and
> >> +button release are considered. An emulated button press event is considered
> >> +exclusively delivered once it has been delivered through an event selection, an
> >> +asynchronous pointer grab, or it and a further event are delivered through a
> >> +synchronous pointer grab.
> > 
> > 'in a mutually exclusive fashion': could you elaborate?
> I thought the rest of the paragraph was the elaboration you are looking
> for. What do you feel is missing?
>
> >> +Touch and pointer grabs are also mutually exclusive. For a given window, any
> >> +touch grab is activated first. If the touch grab is rejected, the pointer grab
> >> +is activated. If an emulated button press event is exclusively delivered to the
> >> +grabbing client as outlined above, the touch sequence is ended for all clients
> >> +still listening for unowned events. Otherwise, when the pointer stream is
> >> +replayed the next window in the window set is checked for touch grabs.
> > 
> > Buh.  If we're going to do this, we might as well allow multiple touch
> > selections on the same window (e.g. if there are grabs on both the slave
> > ID and XIAllDevices, deliver first to the slave grab, then to
> > XIAllDevices).  Not that that's necessarily a bad idea, mind, but I'd
> > like some consistency between touch and pointer here: either one grab
> > per window, or multiple.
> It is my understanding that only one client may grab a device per
> window, which also means one client can't grab XIAllDevices while
> another grabs a specific device.

fwiw, XIGrabDevice(XIAllDevices) will always fail with BadDevice.
for passive grabs, the above is correct.

> The only other point here is whether one client can grab the master
> device while another client grabs the slave device. However, when a
> slave device is grabbed it is detached from the master device. So I
> think the point is moot.
>
> >> +If the touch sequence is not exclusively delivered to any client through a grab,
> >> +the touch and emulated pointer events may be delivered to clients selecting for
> >> +the events. Event propagation for the touch sequence ends at the first client
> >> +selecting for touch and/or pointer events. Note that a client may receive both
> >> +touch and emulated pointer events for the same touch sequence through event
> >> +selection.
> > 
> > Oh? So if someone has selected for both pointer and touch events on the
> > same window, they receive both the touch events and the emulated pointer
> > stream? How about if different clients select on the window? How does
> > that work given that clients with selections cannot currently assert or
> > reject ownership? Surely both the touch and pointer selections will then
> > think they're the owner ... so either we're pointlessly delivering both
> > the touch events and the emulated pointer events to the same client, or
> > two clients think they're the owner of the touch stream.  Either way,
> > it's bad news.
>
> The X protocol has always had this property that if you select for
> pointer events, you can't assume exclusivity of event delivery. This is
> in contrast to pointer grabs, where you do have exclusivity.

this only applies to Motion and Release events, not to Press events though.
any client that selects for ButtonPress events expects exclusivity since it
triggers an implicit passive grab. 
assume we have two clients selecting for pointer and touch events
respectively.  if we always deliver touch events first, I don't know how we
can emulate pointer events to two clients since the device is already
grabbed by then.

this pretty much comes down to two things:
- we should specify that only one client may select for touch events on a
  given window, just like for button press (I _think_ we may have this in
  the protocol already)
- we need to decide if pointer emulation happens if the client selects for
  pointer + touch events or if we trust the client to handle this situation

> There's nothing that prevents one client from selecting for touches
> while another client selects for pointer events on the same window.
> However, there is a clear distinction: the pointer selecting client
> knows that it may not be the only receiver of events, while the touch
> selecting client knows it has exclusive right to the touch events.
> Also, delivering an emulated pointer and its associated touch event
> isn't pointless. It's how Windows handles things today, so toolkits like
> Qt are set up to deal with this situation. One could argue that Qt
> could/should be handling things differently for XI 2.1, but I don't have
> a good argument why we should force them to.

what do they do with the emulated pointer event? do they process it or
discard it anyway?

> >> @@ -866,6 +949,9 @@ are required to be 0.
> >>      master
> >>          The new master device to attach this slave device to.
> >>  
> >> +    If any clients are selecting for touch events from the slave device, their
> >> +    selection will be canceled.
> > 
> > Does that mean the selection will be removed completely, and the
> > selection will no longer be present if the SD is removed, and all
> > clients are required to re-select every time the hierarchy changes, or?
> If the SD is removed, then all event selections are already canceled
> aren't they? If not, that seems like a broken protocol. Device IDs are
> reused, so you might end up selecting for events from a different device
> than you meant to.
> Clients only are required to re-select when the specific slave device
> they care about is attached, not on every hierarchy change.

I guess daniel meant s/removed/reattached/, not as in "unplugged". But you
answered the question: a client registering for touch events must re-select
for touch events on every hierarchy change that affects the SD (including
the race conditions this implies).

What is the reason for this again? If we already require clients to track
the SDs, can we assume that they want the events from the device as
selected, even if reattached?

> > I'd prefer to just remove this bit completely.
>
> Got any other suggestion? This is due to the fact that only one client
> may select for touch events on a window from a device at a time. When
> you attach, this rule could be broken unless you do something about it.
>
> >> @@ -1538,9 +1624,9 @@ are required to be 0.
> >>      sequence to direct further delivery.
> >>  
> >>      deviceid
> >> -        The grabbed device ID.
> >> +        The slave device ID for a grabbed touch sequence.
> >>      touchid
> >> -        The ID of the currently-grabbed touch sequence.
> >> +        The ID of the touch sequence to modify.
> > 
> > Good catches, thanks.
> > 
> > The rest looks fairly solid to me, although I'm worried enough about the
> > above - and particularly how we'll handle delivery/pointer emulation
> > when a pointer grab is already active on the device - that I really
> > don't want to cut an RC now.  I don't think we can really commit to
> > semantics for a lot of this until we've seen a working implementation
> > with a full stack; at the moment, we don't have one upstream, and
> > Ubuntu's seems to be in enough flux that I don't think it's settled down
> > enough to be able to say that the semantics are necessarily what we
> > want.
>
> There's no need for an rc per se, I just thought some sort of upstream
> release would be helpful, even if it's just called an "alpha". We're
> getting by without an official upstream release of any sort, so this
> isn't a huge deal.
> As for our stack, it's pretty settled for most uses at this point.
> Remember that most clients will just be selecting for begin, update,
> end. For example, we have Qt with multitouch in ubuntu, and it only
> selects for those three events. Direct touch devices in particular work
> very well. The comments above highlight issues with indirect devices,
> but they are corner cases that don't really come up much in usage. If we
> discard touches during active indirect device pointer grabs, I think
> we'll cover 99% of the use cases, and it's a pretty simple change.
> As for what happens when using a direct device during an active pointer
> grab, we essentially skip all grabs above the pointer grab window and
> continue from there. This should be noted in the spec somewhere, but in
> practice it works.
> To give an idea of the breadth of testing we've got so far, here's one
> thing we've been doing:
> Compiz plugin with a touch grab on the root window that always rejects
> Compiz has a passive grab on non-focused windows
> Qt fingerpaint application has touch selections
> Qt fingerpaint application actively grabs when you click on a drop down menu
> The active grab for the drop down menu works when you touch on the menu
> title, drag to the menu selection you want, and release. It also works
> when you tap it, a grab is placed on the pointer, and then you tap again
> on the menu selection you want or anywhere else on screen.
> I know the protocol document is in flux, but in reality I think we are
> very close to a fully working implementation.
>
> > Sorry this has taken so long, but some of it was non-trivial.  I'll try
> > to be more responsive from now on and let you know when I'm back working
> > on the implementation, but that's probably 2-3 weeks away at the moment.
> > 
> > Cheers,
> > Daniel
> > 
> > [0]: The Logitech Air mouse would qualify, but it seems to do its
> >      gesture recognition (for scroll events only) in hardware, so.
> >      Also, it's crap.  I bought it to make it work under Linux, turns
> >      out it was actually perfectly HID-compliant and worked out of the
> >      box, but was basically unusable and pointless.  $100 down the
> >      drain.  But I digress.
>
> MS has two touch surface mice, one already available. Another company
> I've never heard of before has a similar mouse, but I can't remember any
> other details.
