X Gesture Extension protocol - draft proposal v1

Chase Douglas chase.douglas at canonical.com
Fri Aug 27 09:53:03 PDT 2010


On Fri, 2010-08-20 at 13:36 +1000, Peter Hutterer wrote:
> On Thu, Aug 19, 2010 at 11:35:07AM -0400, Chase Douglas wrote:
> > On Thu, 2010-08-19 at 09:10 +1000, Peter Hutterer wrote:
> > > On Wed, Aug 18, 2010 at 05:02:57PM -0400, Chase Douglas wrote:
> > > > On Wed, 2010-08-18 at 21:54 +0200, Simon Thum wrote:
> > By adding the protocol to the server, we just provide a mechanism for
> > gesture recognition if people want it. If you don't have recognition and
> > you want it, you can just install a recognizer package and it will run
> > when you start X. If you have recognition and you don't want it, you can
> > remove the recognizer or stop it from starting up. If people want, we
> > could define an option so that you could forbid gesture recognizers from
> > registering.
> 
> now I'm confused again. can you confirm the following:
> - there's only one GE at a time. the GE is per server, not per client.
> - if one GE is registered, clients have to take it or leave it, because
>   replacing it means replacing it for all other clients.
>   (so if e.g. GNOME and KDE decide to implement two different ones, then
>   mixing the two will be iffy.)

Correct. Hopefully one engine will end up serving most cases though,
much as most people use the SLAB allocator in the kernel even though
other options exist.
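
To make the one-engine-per-server rule concrete, a registration request
could follow the usual Xproto conventions. This is only a sketch; the
request and field names below are mine, not something the draft defines:

    /* Hypothetical registration request; names are illustrative only.
     * CARD8/CARD16/CARD32 come from <X11/Xmd.h>. */
    #include <X11/Xmd.h>

    typedef struct {
        CARD8  reqType;        /* major opcode assigned to the extension */
        CARD8  gestureReqType; /* e.g. X_GestureRegisterEngine (made up) */
        CARD16 length;         /* request length in 4-byte units */
        CARD32 engineVersion;  /* protocol version the engine speaks */
    } xGestureRegisterEngineReq;

    /* A second registration attempt would fail (say, with BadAccess),
     * which is what enforces the one-GE-per-server rule. */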

> how do I "stop the GE from starting up" if another client already needs
> it? 
> I seem to be missing something here.

I'm just providing a way to inhibit a gesture engine from starting up
at all, if someone really wants to do so. If another client wants the
gesture engine (few clients should be written to *need* a gesture
engine) but it's inhibited, then too bad. The purpose of the
configuration option is just to cater to people who want the absolute
minimum latency, though I don't think anyone should notice the latency
in practice.
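
For what it's worth, such a switch could be as simple as a ServerFlags
entry in xorg.conf. The option name below is made up; the draft doesn't
define one:

    Section "ServerFlags"
        # Hypothetical option; the draft only suggests one could exist.
        Option "AllowGestureEngine" "off"
    EndSection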

> > Now, once a gesture engine is registered to the server it is an extra
> > code path for all inputs to all clients. However, there's no way around
> > that if you want to support environment/window manager gestures.
> > 
> > If this is all terribly wrong-headed and you are right that gestures
> > should be done above X, we haven't really caused much issue by
> > attempting this. Just disable gesture support through X as I described
> > above and use a recognizer in a toolkit or some other client-side
> > library.
> 
> But if we disable the GE in the server and use a client-side library, what
> do we have the server one for (and the protocol support, including all the code
> pieces required in the server)?

I'm just trying to be flexible here :). We can implement the X gesture
extension, but if no one ends up liking it or using it, then it gets
deprecated and removed and everyone moves on. We won't be changing the
way the server works so fundamentally that going back and removing the
extension later would be an issue.

> ok, then this runs into the problem of the GE hogging the events in some
> cases (think of that google maps use-case in the last email). The GE will
> always zoom, even if both of my fingers were on manipulatable objects and
> all I wanted to do was move those.

I suggest that the client listen for pinch and pan gestures. The same
data carried by the raw MT events can be gleaned in several different
ways:

1. The bounding box of the points that comprise the gestures
2. The focus point and deltas for the pinch and pan gestures
3. The raw MT event data encapsulated in the gesture events

The client can use whichever data is easiest for it to handle. For
example, if the maps API wants zooming by a certain amount in or out,
centered around a given point, then approach 2 would be easiest. If the
API wants the two touch locations pinned to screen coordinates, then
approach 1 or 3 would work.
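
As a rough sketch of how a single event could carry all three kinds of
data at once (modeled on X generic events; every field name after
evtype is mine, not from the draft):

    #include <X11/Xmd.h>

    typedef struct {
        CARD8  type;           /* GenericEvent */
        CARD8  extension;      /* gesture extension's major opcode */
        CARD16 sequenceNumber;
        CARD32 length;         /* trailing data in 4-byte units */
        CARD16 evtype;         /* e.g. a hypothetical GesturePinch */
        /* 1. bounding box of the touches making up the gesture */
        INT16  bbox_x1, bbox_y1, bbox_x2, bbox_y2;
        /* 2. focus point and zoom delta of the pinch */
        INT16  focus_x, focus_y;
        INT32  scale_delta;    /* fixed-point scale change */
        /* 3. count of raw MT touch records appended after this */
        CARD16 num_touches;
    } xGesturePinchEvent;

A client that only cares about approach 2 reads focus_x, focus_y and
scale_delta and ignores the rest.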

> I don't see yet how you can register for both touch and gesture events
> without running into these issues.

I'm not sure how useful it would be either, but I don't want to codify
mutual exclusion in the protocol unless it's absolutely needed.

> > > > I don't fully understand your last sentence, but I will try to address
> > > > latency concerns. I think our current gesture recognition code is less
> > > > than 500 lines of code (maybe nearer 300 lines; Henrik wrote it and
> > > > has more details if you are interested). Obviously, you can do a lot in
> > > > a small amount of code to kill latency, but I think Henrik has crafted a
> > > > tight and fast algorithm. I was skeptical about latency at first too,
> > > > but human interfaces being what they are, we should have plenty of cpu
> > > > cycles to do all the gesture primitive recognition we need (please don't
> > > > read this and assume we're pegging the processors either :). So far in
> > > > testing, I haven't seen any noticeable delay, but it's still rather
> > > > early in development.
> > > 
> > > I don't think the algorithm is what's holding you back anyway, it's the
> > > nature of gestures and human input in general. Even if your GE is
> > > instantaneous in the recognition, you may not know for N milliseconds if the
> > > given input may even translate into a gesture. Example: middle mouse button
> > > emulation code - you can't solve it without a timeout.
> > 
> > Yes, that's true. This is an area where I think we need to do some
> > research and user testing once we have the foundation implemented. As
> > for specifics, I feel Henrik is more qualified to respond. I'll just
> > note that in my own testing I think Henrik's implementation works well
> > all around. We'll have more useful experiences once the Unity window
> > manager is fully integrated and we can test it out. That should occur
> > within the next week or two.
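
To put Peter's point in code: even a zero-cost recognizer has to hold
events back until it can commit one way or the other. A minimal sketch
of that decision, where the names and the timeout value are my
assumptions and not taken from Henrik's implementation:

    #define GESTURE_DECIDE_MS 100   /* assumed value */

    enum decision { UNDECIDED, DIRECT_TOUCH, GESTURE };

    /* With one touch down we cannot tell a drag from the start of a
     * two-finger gesture until a second touch arrives or the timeout
     * expires; until then the events must be held back. */
    static enum decision classify(int num_touches, unsigned elapsed_ms)
    {
        if (num_touches >= 2)
            return GESTURE;      /* enough contacts to recognize */
        if (elapsed_ms >= GESTURE_DECIDE_MS)
            return DIRECT_TOUCH; /* stop waiting; deliver raw touches */
        return UNDECIDED;        /* keep holding the events back */
    }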
> 
> how many real multitouch applications do you have? ones that don't use
> gestures but direct touch interaction?

I'm not sure I understand the point of the question, since MT through X
doesn't exist yet. If you're talking about something written with PyMT
or another library that bypasses X, I haven't personally played with
any. For Maverick, we're really just targeting gesture use cases, since
that's the only approach that works well with X.

Thanks,

-- Chase


