[PATCH] Ensure blitter quiescence before reading pixels from the framebuffer

Michel Dänzer michel at tungstengraphics.com
Tue Jul 31 06:48:05 PDT 2007


On Tue, 2007-07-31 at 15:07 +0200, Bernardo Innocenti wrote:
> Michel Dänzer wrote:
> > On Mon, 2007-07-30 at 16:39 +0200, Bernardo Innocenti wrote:
> >> Michel Dänzer wrote:
> 
> And, from what I've seen, all antialiased primitives require drawing in an a8
> off-screen bitmap (unaccelerated) and then composing a source bitmap through
> the a8 mask.  For solid fills, the source is always a repeated 1x1 bitmap.

I also recently discovered that this can be a pathological path, but
AFAICT the bottleneck isn't (yet) what you think it is. Rather, it's the
fact that the a8 pixmap gets fully initialized by the blitter and then
immediately partly overwritten by software rendering. This is especially bad with
the current pixmap migration schemes because it means kicking off the
blitter, waiting for it to finish and then reading back the full pixmap
contents to system memory, which usually involves at least one slow
memcpy. This will at least partly go away when using TTM for pixmaps,
but meanwhile I've played a little with just taking note that a pixmap
is fully covered by a solid colour and only actually initializing its
contents when and where appropriate. Unfortunately, I haven't got it
working yet.
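
To illustrate the idea (this is only a rough sketch, not the actual EXA
code): the LazyPixmap struct and all lazy_* functions below are
hypothetical names made up for this example. A solid fill is merely
recorded, and the pixel storage is only written once software rendering
actually needs it, so no blitter round trip or readback is involved.
The sketch only covers the "when" part; it materializes the whole pixmap
at once rather than just the regions that are touched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical a8 pixmap (one byte per pixel); not the real PixmapRec. */
typedef struct {
    int width, height;
    uint8_t *pixels;     /* system-memory storage */
    bool fill_pending;   /* true: every pixel logically equals fill_value */
    uint8_t fill_value;
} LazyPixmap;

/* Allocate storage without touching it; the pixmap starts as "all 0". */
static bool lazy_init(LazyPixmap *p, int width, int height)
{
    assert(width > 0 && height > 0);
    p->width = width;
    p->height = height;
    p->pixels = malloc((size_t)width * (size_t)height);
    p->fill_pending = true;
    p->fill_value = 0;
    return p->pixels != NULL;
}

/* A full-coverage solid fill only records the colour. */
static void lazy_solid_fill(LazyPixmap *p, uint8_t value)
{
    p->fill_pending = true;
    p->fill_value = value;
}

/* Called before any CPU access that needs real pixel contents. */
static void lazy_materialize(LazyPixmap *p)
{
    if (p->fill_pending) {
        memset(p->pixels, p->fill_value,
               (size_t)p->width * (size_t)p->height);
        p->fill_pending = false;
    }
}

/* Software rendering path: materialize first, then write. */
static void lazy_put_pixel(LazyPixmap *p, int x, int y, uint8_t value)
{
    assert(x >= 0 && x < p->width && y >= 0 && y < p->height);
    lazy_materialize(p);
    p->pixels[(size_t)y * (size_t)p->width + (size_t)x] = value;
}

/* Reads can be answered from the recorded colour while a fill is pending. */
static uint8_t lazy_get_pixel(const LazyPixmap *p, int x, int y)
{
    assert(x >= 0 && x < p->width && y >= 0 && y < p->height);
    if (p->fill_pending)
        return p->fill_value;
    return p->pixels[(size_t)y * (size_t)p->width + (size_t)x];
}

static void lazy_free(LazyPixmap *p)
{
    free(p->pixels);
    p->pixels = NULL;
}
```

With something like this, clearing a mask pixmap and then drawing a few
pixels into it in software costs one memset in system memory instead of
a blit, a wait and a readback.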

> Along with the rendering of glyphs, these small but frequent operations are
> likely to dominate rendering time for the typical desktop.

FWIW, the last text rendering speed numbers I saw didn't seem much, if at
all, worse with EXA than with XAA.


> > I guess it's just not feasible to accurately estimate performance from
> > code inspection. It needs to be measured.
> 
> I wanted to do it at some point, but running oprofile on slow hardware is
> quite painful.  And, still, you need to do some guessing when you interpret
> the results.
> 
> For instance, I expect to see a lot of time spent in the driver, but mostly
> because EXA is asking it to do spurious uploads of small bitmaps.

IME using sysprof instead of oprofile helps thanks to its intuitive
callgraphs.


-- 
Earthling Michel Dänzer           |          http://tungstengraphics.com
Libre software enthusiast         |          Debian, X and DRI developer



