[PATCH 0/3] present: Improve interactions with compositing manager

Owen Taylor otaylor at redhat.com
Mon Jan 26 17:01:17 PST 2015


On Mon, 2015-01-26 at 16:22 -0800, Keith Packard wrote:
> Owen Taylor <otaylor at redhat.com> writes:
> 
> > Sorry for getting to this so slowly. At least for the algorithms that
> > GNOME currently uses, this isn't going to help avoid the extra frame of
> > latency.
> 
> Maybe you don't quite understand this change?

I think I do....

> > What we do instead is start drawing the frame at a fixed point in the frame
> > cycle. The point isn't chosen "as late as possible", since that's hard
> > to predict reliably - after all, we can get damage events for drawing
> > that isn't even completed on the GPU (at least we could with DRI2) - but
> > rather is fixed to an arbitrary point 2 ms after VBlank. The idea here
> > is to accept the extra frame of latency and within that constraint make
> > things as predictable and as smooth as possible.
> 
> Let me explain what I've done and see if that will help your current
> situation.
> 
> Right now, a redirected application that calls PresentPixmap will have
> that presentation delayed until the scheduled frame. So, an application
> which is 'keeping up' (presenting contents very soon after top-of-frame)
> will have their new contents waiting around until the next frame begins,
> at which point they will be copied to the redirected pixmap and damage
> delivered to the compositor.
> 
> The change is to copy the contents immediately, send the damage to the
> compositor, and delay sending the PresentNotify event to the client
> until the start of the next frame.
> 
> So, if an application manages to present a new frame in the 2ms after
> vblank, the change will get that up on the screen a full frame earlier.

I think 2ms is generally unrealistic for rendering for an application
that is doing something non-trivial - i.e. for something other than
glxgears. Especially when you consider that in order to improve things
the app has to consistently and reliably hit that 2ms - if the rendering
sometimes takes 1.9ms and sometimes 2.1ms, it's going to look awful.
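
To make the 1.9ms/2.1ms case concrete, here is a rough sketch of the
timing. It is only a toy model, not code from the X server or Mutter,
and it assumes things not stated above: a 60Hz display (16.667ms frame),
an application that starts rendering exactly at each vblank, a
compositor that samples damage exactly 2ms after vblank, and a
compositor whose output is scanned out at the following vblank. All
names are made up for illustration.

```python
# Toy timing model; every constant and name here is an assumption.
FRAME_US = 16_667             # assumed 60Hz frame period, in microseconds
COMPOSITE_OFFSET_US = 2_000   # compositor starts drawing 2ms after vblank


def display_vblank(start_vblank, render_us):
    """Index of the vblank at which a frame reaches the screen when the
    damage is delivered as soon as rendering finishes."""
    damage_time = start_vblank * FRAME_US + render_us
    # Compositor pass n starts at n * FRAME_US + COMPOSITE_OFFSET_US and
    # its output is scanned out at vblank n + 1. Find the first pass
    # that starts at or after the damage arrives.
    n = start_vblank
    while n * FRAME_US + COMPOSITE_OFFSET_US < damage_time:
        n += 1
    return n + 1


# Rendering alternates just under and just over the 2ms window.
render_times_us = [1_900, 2_100, 1_900, 2_100]
shown = [display_vblank(k, r) for k, r in enumerate(render_times_us)]
gaps = [b - a for a, b in zip(shown, shown[1:])]

print("shown at vblank:", shown)
print("gaps between consecutive frames:", gaps)
```

In this model a gap of 0 means two application frames land in the same
compositor pass, so one of them is never seen, and a gap of 2 means the
previous frame stays on screen for two refreshes.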

[ It has to be said that the 2ms is 2ms of *CPU* time unless there's a
glFinish() or equivalent somewhere - that helps a bit ]

The only way that it's going to be reliable is if the application adds
buffering app-side and renders the frame that is presented in the 2ms
window ahead of time. And then we've just gone back to having the extra
frame of latency.

> > I think to get rid of the extra frame of latency, you need to be able to
> > "edit" a queued presentation in some fashion. If the compositor could
> > put the active window into a separate hardware overlay then it would be
> > easy. But it might be possible to just present another buffer:
> 
> You can "edit" the queued operation with the Present API. Current kernel
> drivers can't do this with page flipping, but if you force a copy, then
> you can replace that pending operation with another copy at any point.
> 
> If you're doing sub-region updates by copying portions of windows to the
> current scan-out buffer, then you can queue as many as you want for a
> particular frame, and all of them will get executed at vblank time. If
> one of the queued requests subsumes an existing request, the previous
> request will be skipped.

OK - there's definitely stuff there that could be experimented with.
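
As a way of thinking about the queueing behaviour described above,
here is a small sketch. It is not the Present implementation: the
message doesn't say exactly what "subsumes" means in the server, so this
assumes plain rectangle containment, and Rect, FrameQueue and their
methods are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned region of the scan-out buffer (hypothetical type)."""
    x: int
    y: int
    w: int
    h: int

    def contains(self, other):
        # True when `other` lies entirely inside this rectangle.
        return (self.x <= other.x and self.y <= other.y
                and other.x + other.w <= self.x + self.w
                and other.y + other.h <= self.y + self.h)


class FrameQueue:
    """Copy requests queued for one target frame (hypothetical)."""

    def __init__(self):
        self.pending = []

    def queue(self, rect):
        # Assumption: a new request that fully covers an earlier one
        # makes the earlier one redundant, so the earlier one is skipped.
        self.pending = [r for r in self.pending if not rect.contains(r)]
        self.pending.append(rect)

    def execute_at_vblank(self):
        # Everything still pending is executed together at vblank.
        done, self.pending = self.pending, []
        return done


q = FrameQueue()
a = Rect(0, 0, 10, 10)
b = Rect(100, 100, 5, 5)
c = Rect(0, 0, 50, 50)    # covers a, but not b
for r in (a, b, c):
    q.queue(r)
executed = q.execute_at_vblank()
print(executed)
```

Here the request for a is dropped when c is queued, while b and c both
run at vblank.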

> > Does it *hurt* to make the change you are proposing? Not that I can
> > think of - there's no downside to getting the damage early, even if we
> > don't use it, and the PresentComplete timestamps would only be as
> > inaccurate as they already are.
> 
> If the damage appears before you draw the next frame, I think it'll help.

Thinking some more, maybe it *does* hurt - we've gone from the
application getting consistent latency as long as it renders in less
than 16ms (because the X server was internally making sure that the
damage was delivered in the 2ms window), to adding an arbitrary
threshold where the latency changes.
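
The difference can be sketched as latency (in frames) against render
time. Again this is only a toy model under assumptions not stated in
this thread: a 60Hz display, an application that starts rendering at a
vblank, a compositor pass 2ms after every vblank, and the compositor's
output scanned out at the next vblank. The function names are invented.

```python
# Toy model; constants and names are assumptions for illustration only.
FRAME_US = 16_667             # assumed 60Hz frame period, in microseconds
COMPOSITE_OFFSET_US = 2_000   # compositor samples damage 2ms after vblank


def shown_after_frames(damage_us):
    """Frames from the vblank where rendering started until scan-out,
    given when the damage reaches the compositor (relative to that
    vblank)."""
    n = 0
    while n * FRAME_US + COMPOSITE_OFFSET_US < damage_us:
        n += 1
    return n + 1


def latency_delayed_copy(render_us):
    # Current behaviour: the copy and the damage happen at the next
    # vblank, whatever the render time (as long as it fits in a frame).
    assert 0 < render_us < FRAME_US
    return shown_after_frames(FRAME_US)


def latency_immediate_copy(render_us):
    # Proposed behaviour: the copy and the damage happen as soon as the
    # application presents.
    assert 0 < render_us < FRAME_US
    return shown_after_frames(render_us)


render_times_us = [500, 1_900, 2_100, 8_000, 15_000]
delayed = [latency_delayed_copy(r) for r in render_times_us]
immediate = [latency_immediate_copy(r) for r in render_times_us]

print("delayed copy:  ", delayed)
print("immediate copy:", immediate)
```

With the delayed copy the latency is flat across render times; with the
immediate copy it steps from one frame to two at the 2ms mark.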

But then again, I'm not going to claim that the current GNOME/Mutter
algorithm is ideal - it's just something I came up with that kept my
test cases reasonably smooth rather than jerking all over the place.

- Owen




