[PATCH inputproto xi 2.1] Updates for pointer emulation and more touch device modes

Mon Mar 7 15:07:57 PST 2011

On 03/07/2011 01:28 AM, Peter Hutterer wrote:
> On Wed, Mar 02, 2011 at 09:25:40AM -0500, Chase Douglas wrote:
>> On 03/02/2011 02:29 AM, Peter Hutterer wrote:
>>> On Tue, Feb 22, 2011 at 10:06:37AM -0500, Chase Douglas wrote:
>>>>  Touch grabs are similar to standard input event grabs in that they take
>>>>  precedence over selections and are searched from the root window to the child
>>>>  window.  The first grab found for a touch sequence will be the owner of that
>>>>  touch sequence, however events for that touch sequence will continue to be
>>>> -delivered to all clients with grabs in the window tree, as well as the client
>>>> -with the deepest selection.  The first client may either “accept” the touch,
>>>> -which claims the touch sequence and stops delivery to all other clients for
>>>> -the duration of the touch sequence, or “reject” the touch sequence, which
>>>> +delivered to all clients with grabs in the window tree, as well as potentially
>>>
>>> why "potentially"? I thought the client with the deepest selection would
>>> always get the event.
>>
>> I wasn't sure how to work this in, and "potentially" seemed the easiest
>> if not the clearest :). Clients can select for touch events without
>> selecting for unowned events. Thus, the client with the deepest
>> selection may or may not receive the event depending on whether unowned
>> events are selected for. This seemed glossed over without adding
>> "potentially", but perhaps adding "potentially" isn't a satisfactory
>> addition either. I'll try to think of a better way of stating this.
> 
> 
> simply removing the potentially should do then. the "selection" bit already
> covers that only selected events will be sent - this is standard practice.

Ok

>>>> +the client with the deepest selection.  The first client may either “accept” the
>>>> +touch, which claims the touch sequence and stops delivery to all other clients
>>>> +for the duration of the touch sequence, or “reject” the touch sequence, which
>>>>  will remove that client from the delivery set and pass ownership on to the
>>>> -next client.
>>>> +next client. When a client, including the initial owner, becomes the owner of a
>>>> +touch, it will receive a TouchOwnership event. When an owning client accepts a
>>>> +touch, further clients receiving unowned events will receive TouchEnd events.
>>>>  
>>>> -Window sets for direct device touches contain the windows from the root to the
>>>> -child in which the touch originated.
>>>> +Clients selecting for touch events may select for either unowned events or only
>>>> +owned events. 
>>>
>>> wait, unowned and owned are mutually exclusive? this is a rather severe
>>> change to Daniel's last spec - why the change of heart?
>>
>> I'm not sure I understand what you're implying, but it's probably due to
>> the wording. Logically, it's: (select for unowned and eventually owned
>> events || select for owned events only). I'll try to think of a better
>> phrasing for this.
> 
> "may select for unowned events, owned events or both." should cover us then.

Sounds good.

>>>> The event stream for an unowned selection is identical to a touch
>>>> +grab. When a client does not select for unowned and ownership events, it will
>>>> +receive a TouchBegin event when it becomes the owner of a touch stream.
>>>
>>> this means you have to buffer Begin + Update events in the server until the
>>> ownership has the right client. Which, IIRC, is the reason we decided not to
>>> re-implement the pointer grab's serial semantics but the parallel ones
>>> sending to all clients.
>>
>> My implementation has a bounded ring buffer with N events for each
>> touch. If you overrun the ring buffer, then you'll get the touch begin
>> event, you'll miss some events from the beginning of the sequence, and
>> then you'll get the last N events.
>>
>> The reason this was added was to reduce the need for clients to listen
>> for unowned events, who may be woken up on every touch event yet never
>> become the owner. This can be a power drain on mobile devices.
>>
>> I've been meaning to add a bit of text saying that clients selecting
>> only for owned events may miss touch update events from the beginning of
>> the sequence.
> 
> so let me rephrase this. unowned events are sent to any client selecting.
> owned events are sent to the client if .... I'm now missing this bit here.
> assuming that you have a TouchBegin and N TouchUpdate events, with one or
> more grabbing clients above you. At what point does a client get which
> event?

There are a few sets of touch events. Touch events consist of:

TouchBegin
TouchUpdate
TouchUpdateUnowned
TouchOwnership
TouchEnd

If you select for unowned events, you may receive all of these events.
You will receive a TouchBegin event when the touch sequence begins,
TouchUpdateUnowned events while you are not the owner, then a
TouchOwnership event when you become the owner, then TouchUpdate
events while you are the owner, and then finally a TouchEnd event.

If you select only for owned events, you may receive:

TouchBegin
TouchUpdate
TouchEnd

In this case, you receive a TouchBegin event when you become the owner
of the touch sequence, and then you receive any TouchUpdate events that
were generated and enqueued while some other client owned the touch
sequence.
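
To make the enqueueing concrete, here is a minimal sketch of the bounded
ring buffer described earlier in the thread. It is Python purely for
illustration and is not the actual server code; the class name, method
names, and the capacity of 3 are all made up for the example.

```python
from collections import deque


class TouchHistory:
    """Hypothetical per-touch buffer: the begin event plus the last N updates."""

    def __init__(self, begin, capacity):
        self.begin = begin
        # A deque with maxlen silently drops the oldest entry when full,
        # which gives the "you only get the last N events" behaviour.
        self.updates = deque(maxlen=capacity)

    def record_update(self, update):
        self.updates.append(update)

    def replay_to_new_owner(self):
        # A client selecting only for owned events sees the TouchBegin first,
        # then whatever updates are still in the buffer.
        return [self.begin] + list(self.updates)


history = TouchHistory(begin=("TouchBegin", 0), capacity=3)
for i in range(1, 6):
    history.record_update(("TouchUpdate", i))
replayed = history.replay_to_new_owner()
```

With five updates and room for three, the new owner gets the begin event
followed by updates 3, 4, and 5; updates 1 and 2 are lost.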

The two selections above are the only two valid selections for any touch
events. A client cannot select for just TouchBegin, TouchUpdateUnowned,
and TouchEnd, for example.
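
As a rough illustration of the two selections and the streams they
produce, here is a small Python sketch. The function names are
hypothetical, and it assumes the client does eventually become the owner
and that no enqueued updates were dropped in the meantime.

```python
OWNED_ONLY = frozenset({"TouchBegin", "TouchUpdate", "TouchEnd"})
WITH_UNOWNED = OWNED_ONLY | {"TouchUpdateUnowned", "TouchOwnership"}


def is_valid_selection(events):
    # Only the two complete sets are accepted; any other subset is invalid.
    return frozenset(events) in (OWNED_ONLY, WITH_UNOWNED)


def client_stream(selects_unowned, updates_before_owner, updates_after_owner):
    """Events one client sees for a touch it ends up owning."""
    stream = ["TouchBegin"]
    if selects_unowned:
        # Begin arrives when the touch starts; updates are marked unowned
        # until the ownership event.
        stream += ["TouchUpdateUnowned"] * updates_before_owner
        stream += ["TouchOwnership"]
    else:
        # Begin arrives on becoming owner, followed by the enqueued updates.
        stream += ["TouchUpdate"] * updates_before_owner
    stream += ["TouchUpdate"] * updates_after_owner
    stream += ["TouchEnd"]
    return stream


unowned_view = client_stream(True, 2, 1)
owned_view = client_stream(False, 2, 1)
```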

> daniel's spec has an example for a client that selects for both owned and
> unowned but I don't know anymore what happens if a client doesn't select for
> owned.

This is a patch against his spec, so what you read should be in here
somewhere :).

You can't select for any touch events without also selecting for owned
events. The only distinction is whether you want unowned events as well.

>>>> +4.4.2 Touch device modes
>>>> +
>>>> +Touch devices come in many different forms with varying capabilities. The
>>>> +following device modes are defined for this protocol:
>>>> +
>>>> +DirectTouch:
>>>> +    These devices map their input region to a subset of the screen region. Touch
>>>> +    events are delivered according to where the touch occurs in the mapped
>>>> +    screen region. An example of a DirectTouch device is a touchscreen.
>>>> +
>>>> +DependentTouch:
>>>> +    These devices do not have a direct correlation between a touch location and
>>>> +    a position on the screen. Touch events are delivered according to the
>>>> +    location of the pointer on screen. An example of a DependentTouch device
>>>> +    is a trackpad.
>>>> +
>>>> +IndependentPointer:
>>>> +    These devices do not have any correlation between touch events and pointer
>>>> +    events. IndependentPointer devices are a subset of DependentTouch devices.
>>>> +    An example of an IndependentPointer device is a mouse with a touch surface.
>>>
>>> I don't quite understand what the difference to DependentTouch is here.
>>> To me, a mouse with a touch surface is identical to a trackpad in how
>>> gestures would be interpreted. At least that's how I'd use it, so having
>>> this as a separate device type seems confusing.
>>
>> I'll take Qt as an example. If you have a touchscreen device, it passes
>> each touch to the widget they landed on. If you have a touchpad, by
>> default they don't send any touch events until two or more touches are
>> active. This is due to the duality of the touchpad as a pointer
>> controlling device and as a touch surface.
>>
>> Although they don't have any support for other types of devices yet, I
>> would assume they would handle IndependentPointer devices differently. I
>> don't see any reason for withholding the first touch for these devices,
>> so Qt must have a way of knowing about these devices.
> 
> uhm, I still don't know what the difference between the two devices is. The
> two examples above are a mouse and a trackpad but I don't see the difference
> here. Both are pointing devices with touch surfaces.

There's no difference in the X server handling. The difference is on the
client side. The client should know the type of device from the XI
protocol. It shouldn't have to fish for any data from anywhere else as
it won't be portable. Thus, we must make a distinction between trackpads
and mice with a touch surface.

For example, on a trackpad the user would like to scroll with two
fingers. On a mouse with a touch surface, the user would like to scroll
with one finger. A distinction between the two types of devices is required.
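
A sketch of how a toolkit might key its behaviour off the device mode,
in Python for illustration only. The enum and function are hypothetical
and are not part of any real toolkit or of the XI protocol bindings. The
DirectTouch case is left undecided because nothing above says how a
touchscreen should scroll.

```python
from enum import Enum


class TouchMode(Enum):
    DIRECT_TOUCH = "DirectTouch"
    DEPENDENT_TOUCH = "DependentTouch"
    INDEPENDENT_POINTER = "IndependentPointer"


def fingers_needed_to_scroll(mode):
    """Client-side policy; the server treats both indirect modes the same."""
    if mode is TouchMode.DEPENDENT_TOUCH:
        # Trackpad: one finger already drives the pointer, so scroll with two.
        return 2
    if mode is TouchMode.INDEPENDENT_POINTER:
        # Mouse with a touch surface: the mouse body drives the pointer,
        # so a single finger is free to scroll.
        return 1
    # DirectTouch: not covered by the discussion, left to the toolkit.
    return None


trackpad_fingers = fingers_needed_to_scroll(TouchMode.DEPENDENT_TOUCH)
touch_mouse_fingers = fingers_needed_to_scroll(TouchMode.INDEPENDENT_POINTER)
```

The point is only that the client needs the mode from the protocol to
make this choice portably.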

>>>> +4.4.4 Pointer emulation for direct touch devices
>>>> +
>>>> +In order to facilitate backwards compatibility with legacy clients, direct touch
>>>> +devices will emulate pointer events. Pointer emulation events will only be
>>>> +delivered through the attached master device; 
>>>
>>> so a pointer emulation event cannot be used on the SD? is this correct?
>>
>> I don't see any need for pointer emulation on an SD, especially when the
>> device isn't really emitting pointer events.
> 
> any pointer-like SD is currently sending pointer events (XI1 and XI2 but not
> core). so we need to make a decision of whether pointer emulation will not
> be available on SDs. which simply means that any XI1 or XI2.0 client cannot
> use device-specific events on touch devices for pointer emulation (gimp and
> GTK3 apps that use XI2).

First, I tried out gimp and it seems broken in Ubuntu Natty :). I can't
seem to get it to work the way I expect it to. Here's what I did:

1. Detach touchpad from master pointer
2. In gimp preferences, set the trackpad to "window" mode

I expected the trackpad surface to be mapped to the canvas, so if I tap
in the top right corner it marks the top right corner of the canvas.
However, the tool still operated in a relative fashion, and very buggily
at that. I'm going to assume my expectations are correct and gimp is
currently broken :).

I assume everything will continue to work properly in this case because
indirect devices send pointer events that are independent in protocol
from the touch events. Thus, SD indirect touch devices still send
pointer events.

This particular section is referring to pointer emulation only for
direct devices. I don't see any reason why one would need pointer events
from a slave direct touch device. The touch surface is already mapped to
screen coordinates. The above usage in gimp is the only use case I'm
aware of for selecting for device events from slave devices instead of
master devices, and it's moot for direct touch devices.

>>> dependent devices like the magic trackpad. urgh.
>>
>> Do we need to say anything? There's no pointer "emulation", per se, for
>> trackpads. Pointer events are sent independently through X for these
>> devices. The only distinction is that there is usually, though not
>> necessarily, a logical mapping between one of the touches and the pointer motion.
> 
> yes, we need to say at least "there is no pointer emulation for dependent
> devices" :) the spec should leave as little room for interpretation as
> possible.

Ok

>>>> + no pointer events will be emulated
>>>> +for floating touch devices. Further, only one touch from any attached slave
>>>> +touch device may be emulated per master device at any time.
>>>
>>>
>>>> +
>>>> +A touch event stream must be delivered to clients in a mutually exclusive
>>>> +fashion. This extends to emulated pointer events. For the purposes of
>>>> +exclusivity, emulated pointer events between an emulated button press and
>>>> +button release are considered. An emulated button press event is considered
>>>> +exclusively delivered once it has been delivered through an event selection, an
>>>> +asynchronous pointer grab, or it and a further event are delivered through a
>>>> +synchronous pointer grab.
>>>
>>> the last sentence is the meat of the paragraph here, the first three simply
>>> served to confuse me :)
>>
>> I believe it's all necessary. Hopefully the following example will show why:
>>
>> physical touch begin
>> emulated pointer motion
>> emulated pointer button press
>> touch begin
>> physical touch move
>> emulated pointer motion
>> touch update
>> physical touch end
>> emulated pointer button release
>> touch end
>>
>> The very first emulated pointer motion moves the cursor to where the
>> touch begins on the screen. This event is not considered for
>> exclusivity. It will always be sent whether or not the touch sequence or
>> further emulated pointer events are delivered. This is why the second
>> sentence is needed.
> 
> as I understand this atm, the spec requires that an event is delivered
> either as touch event or as pointer event to a client depending on the
> grab/event selection, with the touch event always having preference. 
> this goes for motion events as well. Once a touch begin or button press is
> delivered, it establishes a grab anyway, so the exclusivity is handled by
> the grab.

A touch begin event generates two pointer events: pointer motion to move
the cursor to the location of the touch, and button press. However, you
won't be hitting any passive pointer grabs until the button press event.
It sounds like you're saying that if a touch begin is grabbed, it should
inhibit the first pointer motion event too. But think about this
sequence of events:

Assumption: If there's only one window under the touch and it's
selecting for both pointer motion and button press, a touch begin must
generate a pointer motion event and then a button press event in that order.

Root window R has a passive button grab
Window A selects for pointer motion and button events
Subwindow B has a touch grab

When you initiate a touch, touch begin, pointer motion, and button press
events are generated. How do you deliver these events to these windows?
You need to service the root window passive button grab first before the
touch grab. But you've generated the pointer motion event before the
button press event, so you've got to send the pointer motion event
first. This event will fall to window A's selection.

You can't do anything else without discarding the pointer motion when
servicing a passive grab. If you tried to do that, you'd have to
completely rework the server's input queue too. The protocol would also
have to allow for a button press event to move the cursor to the new
position as well, which I don't think it does. And if the protocol did
allow for it, I'm not sure how many apps would break if we started
sending button press events with a motion event to move the cursor to
the location of the button press.

The simplest solution is to just give up on exclusivity for the initial
pointer motion event; it must be generated and it must be sent. Don't
consider it part of the exclusive stream of events when handling touch
and pointer grabs.
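
In sketch form (Python, illustrative only, with made-up names), the
emulation could tag each generated pointer event with whether it counts
toward exclusivity. Treating the button release itself as part of the
exclusive stream is an assumption here; the spec text only says
"between" press and release.

```python
def emulate_pointer_events(physical_event):
    """Return (pointer_event, counts_for_exclusivity) pairs for one touch event."""
    if physical_event == "begin":
        # The initial motion only moves the cursor to the touch location.
        # It is always sent and is never part of the exclusive stream.
        return [("Motion", False), ("ButtonPress", True)]
    if physical_event == "move":
        return [("Motion", True)]
    if physical_event == "end":
        # Assumption: the release is counted along with the press.
        return [("ButtonRelease", True)]
    raise ValueError(f"unknown physical event: {physical_event!r}")


sequence = []
for ev in ("begin", "move", "end"):
    sequence += emulate_pointer_events(ev)
always_sent = [name for name, exclusive in sequence if not exclusive]
```

Only the very first motion ends up outside the exclusive stream.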

Related, for consistency I think the emulated touch should always move
the cursor, even if only touch events are selected for. Just a parting
thought :).

>>>> +Touch and pointer grabs are also mutually exclusive. For a given window, any
>>>> +touch grab is activated first. If the touch grab is rejected, the pointer grab
>>>> +is activated. If an emulated button press event is exclusively delivered to the
>>>> +grabbing client as outlined above, the touch sequence is ended for all clients
>>>> +still listening for unowned events. 
>>>
>>> this means you're losing touch events between the grab activating and 
>>> AllowEvents(ReplayPointer).
>>
>> How? (I've got this working here just fine, with both active and passive
>> pointer grabs and touch grabs and selecting clients)
> 
> AllowEvents(ReplayPointer) is the semantic equivalent to touch rejection, so
> you cannot end the touch sequence until the client actually terminates the
> button grab. grabs can be nested, so delivering a button event to a client
> mustn't stop touch events delivered to the next client unless you know that
> the client doesn't consume the event.

AllowEvents(ReplayPointer) is similar to touch event rejection, but you
can still reject a touch sequence after multiple events have been
delivered, whereas you can't do the same with a synchronous pointer grab.

If a button press and then a pointer motion event have been delivered
through a synchronous grab, there's no way to undo the button press
event. The button press event has been sent and handled. Thus, part of
the touch sequence has been irrevocably handled, so any clients who do
not own the sequence are barred from handling the sequence no matter
what the grabbing client does from now on. Once we are in this
situation, we should end the touch sequence for all the touch clients
still listening to the unowned events.
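
The rule for when an emulated button press counts as exclusively
delivered can be written down as a small predicate. This is a Python
sketch with hypothetical names, not server code; the route strings are
invented labels for the three delivery paths named in the patch text.

```python
def press_exclusively_delivered(route, events_delivered):
    """events_delivered counts the button press plus any events after it."""
    if route in ("selection", "async_grab"):
        # Delivery of the press itself is already irrevocable.
        return events_delivered >= 1
    if route == "sync_grab":
        # The press alone can still be replayed; once a further event
        # has gone out as well, it cannot be undone.
        return events_delivered >= 2
    raise ValueError(f"unknown delivery route: {route!r}")


def end_touch_for_unowned_listeners(route, events_delivered):
    # Once part of the sequence is irrevocably handled, clients still
    # listening for unowned events get their touch sequence ended.
    return press_exclusively_delivered(route, events_delivered)


replayable = not press_exclusively_delivered("sync_grab", 1)
committed = press_exclusively_delivered("sync_grab", 2)
```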

>>>> Otherwise, when the pointer stream is
>>>> +replayed the next window in the window set is checked for touch grabs.
>>>> +
>>>> +If the touch sequence is not exclusively delivered to any client through a grab,
>>>
>>> this may be a language barrier issue, but "not exclusively delivered to any
>>> client" doesn't sound exclusive at all to me.
>>
>> Part of a touch sequence may be delivered to a client without actually
>> being "exclusively" delivered to a client. For example, the first button
>> press may be delivered to a passively grabbing window manager before
>> being replayed. This is a delivery of part of the touch sequence, but
>> it's not an "exclusive" delivery.
> 
> "If the touch sequence is not delivered to a grabbing client, ..."

Sounds good.

BTW, I thought of a better approach for handling touches on an indirect
device when the pointer moves out of the touch selecting window or is
grabbed by another client. Instead of cancelling all current touches on
all attached devices, we could put the burden on the clients to watch
for LeaveNotify events. These events are sent whenever the pointer
leaves the window or is grabbed, and the client should stop handling
touch events, with the exception of touch end events, until an
EnterNotify event is received.
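
On the client side that could look roughly like the following Python
sketch. The class is hypothetical and uses plain strings for event
names; it also assumes the pointer starts out inside the window.

```python
TOUCH_EVENTS = {"TouchBegin", "TouchUpdate", "TouchUpdateUnowned",
                "TouchOwnership", "TouchEnd"}


class TouchFilter:
    """Decides whether the client should act on each incoming event."""

    def __init__(self):
        self.pointer_inside = True  # assumption: we start inside the window

    def should_handle_touch(self, event):
        if event == "LeaveNotify":
            self.pointer_inside = False
            return False
        if event == "EnterNotify":
            self.pointer_inside = True
            return False
        if event == "TouchEnd":
            # Touch ends are always handled so no touch is left dangling.
            return True
        if event in TOUCH_EVENTS:
            return self.pointer_inside
        return False


f = TouchFilter()
decisions = [f.should_handle_touch(e) for e in
             ("TouchBegin", "LeaveNotify", "TouchUpdate", "TouchEnd",
              "EnterNotify", "TouchUpdate")]
```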

An alternative is to enforce this in the server by withholding touch
events, with the exception of touch end, in these cases. When the
pointer returns or the grab is released, a touch begin event is sent for
each new touch and a touch update event is sent for each touch that
changed state during the withheld period.
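
A minimal sketch of the server-side variant, again in Python with
made-up names. It assumes that a touch which both begins and ends while
events are withheld is dropped entirely, since the client never saw it
begin; the proposal above does not say what should happen in that case.

```python
class Withholder:
    def __init__(self, known_touches):
        self.known = set(known_touches)  # touches the client already saw begin
        self.pending_begin = []          # new touches, in arrival order
        self.pending_update = []         # known touches that changed state

    def withheld_event(self, kind, touch):
        """Handle one event during the withheld period; return what is sent now."""
        if kind == "TouchEnd":
            if touch in self.pending_begin:
                self.pending_begin.remove(touch)
                return []  # assumption: never begun for the client, so drop
            if touch in self.pending_update:
                self.pending_update.remove(touch)
            self.known.discard(touch)
            return [("TouchEnd", touch)]  # touch end is never withheld
        if kind == "TouchBegin":
            self.pending_begin.append(touch)
            return []
        if kind == "TouchUpdate":
            if touch in self.known and touch not in self.pending_update:
                self.pending_update.append(touch)
            return []
        raise ValueError(f"unknown event kind: {kind!r}")

    def resume(self):
        """Pointer returned or grab released: flush one event per touch."""
        events = [("TouchBegin", t) for t in self.pending_begin]
        events += [("TouchUpdate", t) for t in self.pending_update]
        self.known.update(self.pending_begin)
        self.pending_begin, self.pending_update = [], []
        return events


w = Withholder({1})
sent_now = []
for kind, touch in (("TouchUpdate", 1), ("TouchUpdate", 1), ("TouchBegin", 2),
                    ("TouchBegin", 3), ("TouchEnd", 3)):
    sent_now += w.withheld_event(kind, touch)
flushed = w.resume()
```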

If these ideas sound good, I'll probably implement the first in Ubuntu
Natty (i.e. an xserver implementation no-op :) due to time constraints,
but I would advocate for the second in the real protocol. In Ubuntu we'd
make a release note about the difference in operation.

I'd like to reiterate that things work pretty well in Ubuntu right now.
I know the protocol work is long and difficult, but it is working well
in practice. And believe me, using a reparenting window manager with a
touch rejecting gesture recognizer and actively grabbing menu drop downs
will sort out most of the protocol corner cases pretty quickly :). I
believe the hardest part left is in communicating the fine details in
English.

Thanks,

-- Chase