X Gesture Extension protocol - draft proposal v1

Peter Hutterer peter.hutterer at who-t.net
Sun Aug 29 20:53:13 PDT 2010


On Fri, Aug 27, 2010 at 12:53:03PM -0400, Chase Douglas wrote:
> On Fri, 2010-08-20 at 13:36 +1000, Peter Hutterer wrote:
> > On Thu, Aug 19, 2010 at 11:35:07AM -0400, Chase Douglas wrote:
> > > On Thu, 2010-08-19 at 09:10 +1000, Peter Hutterer wrote:
> > > > On Wed, Aug 18, 2010 at 05:02:57PM -0400, Chase Douglas wrote:
> > > > > On Wed, 2010-08-18 at 21:54 +0200, Simon Thum wrote:
> > > By adding the protocol to the server, we just provide a mechanism for
> > > gesture recognition if people want it. If you don't have recognition and
> > > you want it, you can just install a recognizer package and it will run
> > > when you start X. If you have recognition and you don't want it, you can
> > > remove the recognizer or stop it from starting up. If people want, we
> > > could define an option so that you could forbid gesture recognizers from
> > > registering.
> > 
> > now I'm confused again. can you confirm the following:
> > - there's only one GE at a time. the GE is per server, not per client.
> > - if one GE is registered, clients have to take it or leave it, because
> >   replacing it means replacing it for all other clients.
> >   (so if e.g. GNOME and KDE decide to implement two different ones, then
> >   mixing the two will be iffy.)
> 
> Correct. Hopefully there will only be one engine that is used in most
> cases though, just as most people use the SLAB allocator in the kernel
> even though there are other options.

I'd say this is overly optimistic. User interfaces often don't have a
single technically superior solution; much of it depends on look and feel
as well as user expectation. My bet is that we'll be about as successful
in having that one gesture engine as we have been in having that one
window manager and that one toolkit.

> > how do I "stop the GE from starting up" if another client already needs
> > it? 
> > I seem to be missing something here.
> 
> I'm just providing a way to inhibit a gesture engine from starting up at
> all, if someone really wanted to do so. If another client wants the
> gesture engine (few clients should be written to *need* a gesture
> engine) but it's inhibited, then too bad. 

Urgh. While we generally rely on clients being nice to each other, this
seems a bit... iffy.

> The purpose of the
> configuration option is just to cater to people who feel they want the
> absolute minimum latency, though I don't think anyone should notice
> latency in practice.

Latency in gestures is probably negligible, but latency in direct-touch
input is a huge problem. There's already little useful feedback on most
touch hardware, so if nothing happens you cannot be sure whether the touch
wasn't registered, you're just not pressing hard enough, the system is
broken, etc.

Latency plays into the same issue. If it gets too high, you're likely to
touch harder, move the finger, or do something else. By then you're losing
what little feedback you already have by changing the input all the time,
never really getting used to the system and its input parameters.

IMO we should strive for direct-touch to be as instantaneous as possible.
 
> > > Now, once a gesture engine is registered to the server it is an extra
> > > code path for all inputs to all clients. However, there's no way around
> > > that if you want to support environment/window manager gestures.
> > > 
> > > If this is all terribly wrong-headed and you are right that gestures
> > > should be done above X, we haven't really caused much issue by
> > > attempting this. Just disable gesture support through X as I described
> > > above and use a recognizer in a toolkit or some other client-side
> > > library.
> > 
> > But if we disable the GE in the server and use a client-side library, what
> > do we have the server one for (and the protocol support, including all the code
> > pieces required in the server)?
> 
> I'm just trying to be flexible here :). We can implement the X gesture
> extension, but if no one ends up liking it nor using it, then it gets
> deprecated and removed and everyone moves on. We won't be fundamentally
> changing the way the server works to the point that going back and
> removing the extension later will be an issue.

Just as a warning: once you ship the extension, it'll be very hard to just
move on. I've fixed quite a few Motif bugs in recent months, and while I
wish I could tell them to move on, that ain't happening anytime soon.
Clients follow their own schedules, and the same is true for a new feature
like the gesture extension. If you want uptake, you need to ship it and
have clients use it. Once you do that, deprecating it isn't quite as
simple anymore.

> > ok, then this runs into the problem of the GE hogging the events in some
> > cases (think of that google maps use-case in the last email). The GE will
> > always zoom, even if both of my fingers were on manipulatable objects and
> > all I wanted to do was move those.
> 
> I suggest that the client listen for pinch and pan gestures. The same
> data as the raw MT events could be gleaned in many different ways:
> 
> 1. The bounding box of the points that comprise the gestures
> 2. The focus point and deltas for the pinch and pan gestures
> 3. The raw MT event data encapsulated in the gesture events
> 
> The client can use whichever data is easiest for them to handle. For
> example, if the maps api wants zooming to be a certain amount in or out
> centered around a given point, then approach 2 would work easiest. If
> the api wants the two locations to pin to screen coordinates, then
> approach 1 or 3 would work.

No. The whole point of gestures is that I don't have to _care_ about the
raw data; I should only care about "zoom by this much, rotate by this
much". If the client has to look at the raw gesture data to figure out
whether there's some non-gesture data hidden in there, we're putting the
cart before the horse. And though that means we don't have to look at a
horse's backside all the time, it is problematic for transportation
purposes.
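
To make the distinction concrete, here is a rough sketch of what a client
consuming only second-tier gesture values could look like. Everything in it
is invented for illustration - the GestureZoomEvent structure, its fields
and the toy view state are not part of the draft protocol or of any
existing library; it only shows what "approach 2" above means from the
client's side:

#include <stdio.h>

/* Hypothetical event a gesture engine could deliver after recognition.
 * None of these names exist in the draft protocol or in any X library;
 * they only illustrate "zoom by this much around this point". */
typedef struct {
    double focus_x;   /* centre of the pinch */
    double focus_y;
    double scale;     /* relative zoom factor since the previous event */
} GestureZoomEvent;

/* Toy application state standing in for something like a map widget. */
struct view {
    double center_x, center_y;  /* point currently at the screen centre */
    double zoom;
};

/* The client never looks at raw touch points; it only applies the
 * recognized parameters to its own state, keeping the focus point
 * stationary on screen while zooming. */
static void handle_zoom(struct view *v, const GestureZoomEvent *ev)
{
    v->zoom *= ev->scale;
    v->center_x += (ev->focus_x - v->center_x) * (1.0 - 1.0 / ev->scale);
    v->center_y += (ev->focus_y - v->center_y) * (1.0 - 1.0 / ev->scale);
}

int main(void)
{
    struct view v = { 0.0, 0.0, 1.0 };
    GestureZoomEvent ev = { 100.0, 50.0, 1.5 }; /* zoom 1.5x around (100, 50) */

    handle_zoom(&v, &ev);
    printf("zoom=%.2f centre=(%.2f, %.2f)\n", v.zoom, v.center_x, v.center_y);
    return 0;
}

The point being: the handler sees "zoom by this much around this point"
and nothing else, so it should never have to dig through raw touch data.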

Either way, this is the main disagreement we have so far. I think of
gestures as a second-tier concept, whereas you designed them as a
first-tier concept. Both approaches have merit; we need to meet at some
point where we're both confident we're not abandoning our basic approach.

> > I don't see yet how you can register for both touch and gesture events
> > without running into these issues.
> 
> I'm not sure how useful it would be either, but I don't want to codify
> mutual exclusion in the protocol unless it's absolutely needed.
> 
> > > > > I don't fully understand your last sentence, but I will try to address
> > > > > latency concerns. I think our current gesture recognition code is less
> > > > > than 500 lines of code (maybe nearer 300 lines; Henrik wrote it and
> > > > > has more details if you are interested). Obviously, you can do a lot in
> > > > > a small amount of code to kill latency, but I think Henrik has crafted a
> > > > > tight and fast algorithm. I was skeptical about latency at first too,
> > > > > but human interfaces being what they are, we should have plenty of cpu
> > > > > cycles to do all the gesture primitive recognition we need (please don't
> > > > > read this and assume we're pegging the processors either :). So far in
> > > > > testing, I haven't seen any noticeable delay, but it's still rather
> > > > > early in development.
> > > > 
> > > > I don't think the algorithm is what's holding you back anyway, it's the
> > > > nature of gestures and human input in general. Even if your GE is
> > > > instantaneous in the recognition, you may not know for N milliseconds if the
> > > > given input may even translate into a gesture. Example: middle mouse button
> > > > emulation code - you can't solve it without a timeout.
> > > 
> > > Yes that's true. This is an area where I think we need to do some
> > > research and user testing once we have the foundation implemented. As
> > > for specifics, I feel Henrik is more qualified to respond. I'll just
> > > note that in my own testing I think Henrik's implementation works well
> > > all around. We'll have more useful experiences once the Unity window
> > > manager is fully integrated and we can test it out. That should occur
> > > within the next week or two.
> > 
> > how many real multitouch applications do you have? ones that don't use
> > gestures but direct touch interaction?
> 
> I'm not sure I understand the point of the question, since MT through X
> doesn't exist yet. If you're talking about something written through
> PyMT or another library that goes around X, I haven't personally played
> with any. For Maverick, we're really just targeting gesture use cases
> since that's the only solution that works well with X.

The point of the question was: we have two partially overlapping needs for
multi-touch. One is to have multi-touch input - however that looks. The
other one is gestures.

Gestures are a subset of touch input though, so we need to make sure to
get touch working first and then figure out how to do gestures based on
that. The other way round will lead us head-first into a wall.
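
One way to picture that layering (purely a sketch, nothing to do with the
proposed extension): raw touch events remain the primary stream, and a
recognizer - wherever it ends up living - is just one consumer of that
stream that turns it into gesture values, while direct-touch clients keep
reading the raw events. Every type and function name below is invented for
this example:

#include <math.h>
#include <stdio.h>

/* Invented raw touch event: the primary stream a direct-touch client
 * would consume as-is. */
struct touch_event {
    int    id;       /* which touchpoint */
    double x, y;     /* position */
};

/* Toy recognizer layered on top of that stream: it watches two
 * touchpoints and reports the change in their separation as a pinch. */
struct pinch_recognizer {
    double start_dist;   /* separation when both touches first appeared */
    double last[2][2];   /* latest position of touch 0 and touch 1 */
    int    seen;         /* bitmask of touch ids seen so far */
};

static double dist(const double a[2], const double b[2])
{
    return hypot(a[0] - b[0], a[1] - b[1]);
}

/* Feed one raw event. Returns a zoom factor once both touches are known,
 * or 0.0 while there isn't enough data yet. The raw event itself would
 * still go to whoever wants direct-touch input. */
static double pinch_feed(struct pinch_recognizer *r,
                         const struct touch_event *ev)
{
    if (ev->id < 0 || ev->id > 1)
        return 0.0;

    r->last[ev->id][0] = ev->x;
    r->last[ev->id][1] = ev->y;
    r->seen |= 1 << ev->id;

    if (r->seen != 0x3)
        return 0.0;

    double d = dist(r->last[0], r->last[1]);
    if (r->start_dist == 0.0) {
        r->start_dist = d;
        return 0.0;
    }
    return d / r->start_dist;   /* > 1 spreading apart, < 1 pinching in */
}

int main(void)
{
    struct pinch_recognizer r = { 0 };
    const struct touch_event stream[] = {
        { 0, 100, 100 }, { 1, 200, 100 },   /* two fingers down */
        { 0,  80, 100 }, { 1, 220, 100 },   /* moving apart */
    };

    for (unsigned i = 0; i < sizeof(stream) / sizeof(stream[0]); i++) {
        double zoom = pinch_feed(&r, &stream[i]);
        if (zoom != 0.0)
            printf("pinch: zoom factor %.2f\n", zoom);
    }
    return 0;
}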

The gesture extension so far seems to be aimed at the second use-case, for
a specific set of gestures, neglecting much of real multi-touch input. And
quite frankly, to me it reads as "we need to get this set of gestures
working now, come what may". Feel free to correct me on this point.

So if you don't have any gesture-interpreting applications other than the
window manager, I don't think that's enough to claim that gesture
integration works nicely.

Cheers,
 Peter

