glyph-pixmaps merged

Eric Anholt eric at anholt.net
Fri Oct 19 18:19:21 PDT 2007


I pulled the glyph-pixmaps code by cworth into master today.  This
removes the use of the nasty Glyphs screen hook that EXA had contorted
itself to accelerate.  Previously, drawing each glyph required
re-uploading the glyph data to the card, which was a synchronous
operation unless you used nasty uploading hacks that ignored cache
effects on your hardware (UploadToScratch in EXA) or could do hostdata
uploads through the ringbuffer (even when the hardware can do this,
it's usually a pain to do and a limited performance win).  An
alternative approach was that of XGL, which added a bunch of
infrastructure to associate private data with glyphs, so that a second
copy of each glyph's rectangle of pixel data was stored in another
format that could be accelerated from.

Instead, the new code stores all of these rectangles of pixel data in
pixmaps, then does the compositing of those pixmaps using the usual
composite hook.  This should be welcomed by all acceleration
architectures and DDX implementations, except possibly XAA.  XAA's old
hook will no longer get called, so the XAA non-component-alpha glyphs
acceleration code (called CPUToScreenAlphaTexture) won't run either.
Of the few drivers that ever had XAA non-CA glyphs code, most had
already disabled that code for being broken (or left it enabled
despite being broken).
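
To make the idea concrete, here is a toy model of "glyphs are just
pixmaps, drawn through the ordinary composite path".  It is a sketch
only, not X server code: Pixmap, composite_over and draw_glyphs are
hypothetical names, and the blend is a plain 8-bit "solid source
through a mask, OVER the destination" on single-channel data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Pixmap:
    """Hypothetical stand-in for a pixmap: 8-bit values, row-major."""
    width: int
    height: int
    data: List[List[int]]

    @classmethod
    def blank(cls, width: int, height: int) -> "Pixmap":
        return cls(width, height, [[0] * width for _ in range(height)])


def composite_over(src_value: int, mask: Pixmap, dst: Pixmap,
                   x: int, y: int) -> None:
    """Generic composite: blend a solid source through `mask` onto `dst`.

    Nothing here knows that the mask happens to be a glyph.
    """
    for row in range(mask.height):
        for col in range(mask.width):
            dx, dy = x + col, y + row
            if 0 <= dx < dst.width and 0 <= dy < dst.height:
                a = mask.data[row][col]
                d = dst.data[dy][dx]
                dst.data[dy][dx] = (src_value * a + d * (255 - a)) // 255


def draw_glyphs(glyph_pixmaps: Dict[str, Pixmap], text: str, dst: Pixmap,
                x: int, y: int, src_value: int = 255) -> int:
    """Draw `text` as one composite call per glyph; return the call count."""
    calls = 0
    for ch in text:
        glyph = glyph_pixmaps[ch]
        composite_over(src_value, glyph, dst, x, y)
        x += glyph.width  # simplistic advance, assumed for illustration
        calls += 1
    return calls


glyphs = {"a": Pixmap(2, 1, [[255, 0]]), "b": Pixmap(1, 1, [[128]])}
target = Pixmap.blank(4, 1)
n_calls = draw_glyphs(glyphs, "ab", target, 0, 0)
```

The point of the sketch is that draw_glyphs needs no glyph-specific
hook: whatever accelerates composite_over accelerates text as well.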

Since we are now just accelerating pixmaps like any other composite
operation, it's a big performance win on 915 text rendering with EXA,
apart from non-antialiased Charter 24, which got slower:

 before          after           Operation
--------   -----------------   -----------------
 66800.0   152000.0 (  2.28)   Char in 80-char aa line (Charter 10) 
 55900.0   137000.0 (  2.45)   Char in 30-char aa line (Charter 24) 
104000.0   129000.0 (  1.24)   Char in 80-char a line (Charter 10) 
 61900.0    44200.0 (  0.71)   Char in 30-char a line (Charter 24) 
 66200.0   152000.0 (  2.30)   Char in 80-char rgb line (Charter 10) 
 39400.0    89100.0 (  2.26)   Char in 30-char rgb line (Charter 24) 
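
The parenthesized column appears to be the "after" figure divided by
the "before" figure, assuming these are rates where higher is better.
A quick check of that reading against the 915 EXA rows above (the
variable names are just for illustration):

```python
# (before, after) pairs copied from the 915 EXA table above.
rows_915_exa = [
    (66800.0, 152000.0),   # 80-char aa line (Charter 10)
    (55900.0, 137000.0),   # 30-char aa line (Charter 24)
    (104000.0, 129000.0),  # 80-char a line (Charter 10)
    (61900.0, 44200.0),    # 30-char a line (Charter 24)
    (66200.0, 152000.0),   # 80-char rgb line (Charter 10)
    (39400.0, 89100.0),    # 30-char rgb line (Charter 24)
]


def speedup(before: float, after: float) -> float:
    """Ratio of the new rate to the old one; below 1.0 is a regression."""
    return after / before


ratios = [round(speedup(b, a), 2) for b, a in rows_915_exa]
regressions = [i for i, r in enumerate(ratios) if r < 1.0]
```

The rounded ratios match the table, and the only row below 1.0 is the
non-antialiased Charter 24 case.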

On 965, due to the synchronous compositing that cworth and keithp have
been working on fixing, it's a less significant win:

 before          after           Operation
--------   -----------------   -----------------
 41200.0    63000.0 (  1.53)   Char in 80-char aa line (Charter 10) 
 36400.0    52200.0 (  1.43)   Char in 30-char aa line (Charter 24) 
171000.0   260000.0 (  1.52)   Char in 80-char a line (Charter 10) 
109000.0    84300.0 (  0.77)   Char in 30-char a line (Charter 24) 
 40000.0    60400.0 (  1.51)   Char in 80-char rgb line (Charter 10) 
 23900.0    44400.0 (  1.86)   Char in 30-char rgb line (Charter 24) 

965 XAA isn't hurt for antialiased text rendering, but something went
weird with Xft non-antialiased rendering, which dropped to between
0.09x and 0.17x of its previous rate:

  before          after           Operation
---------   -----------------   -----------------
 173000.0   176000.0 (  1.02)   Char in 80-char aa line (Charter 10)
  64300.0    65000.0 (  1.01)   Char in 30-char aa line (Charter 24)
3150000.0   288000.0 ( 0.091)   Char in 80-char a line (Charter 10)
 558000.0    92900.0 (  0.17)   Char in 30-char a line (Charter 24)
 145000.0   147000.0 (  1.01)   Char in 80-char rgb line (Charter 10)
  47600.0    48000.0 (  1.01)   Char in 30-char rgb line (Charter 24)

It may be that the glyphs implementations in EXA and XAA had some
relevant paths for improving the performance of the non-antialiased
code.  EXA and XAA could handle those cases in their Composite
handlers if desired, or there may be some generic improvements to be
made to the old miGlyphs (now CompositeGlyphs) code to improve this
case.

The merge resulted in a net loss of code, but there's also a bunch of
code to remove, still.  Now that Glyphs won't be seen beyond the DIX
implementation of Render, we can remove the glyph privates,
RealizeGlyph/UnrealizeGlyph hooks, and the remaining Glyphs hook
implementations.  In the process I'll try to see what XAA and EXA were
doing before that might still be relevant.

Long-term there is additional work to be done in looking at whether we
can make our text even faster by reducing the number of state updates
required for the hardware in drawing text.  Having 1000 tiny little
surfaces for your 1000 tiny little glyphs and having to keep a surface
state cache entry for each one to get good performance may not be so
hot, and there must be non-zero cost for changing state per glyph.  It
may be that we end up wanting glyphs to be stored together in large
pixmaps.  That is probably a win for hardware and our drivers, since
it reduces the amount of state for text rendering, and it may even be
better for software, with better CPU caching when you're rendering
many glyphs in a row.  We'll have to see.
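
As a rough illustration of the "large pixmaps" idea (not a proposal
for the actual implementation), the sketch below packs glyph
rectangles into one shared atlas with a simple row-by-row "shelf"
packer, then counts how often the source surface would change while
drawing a string.  pack_shelves and count_source_switches are
hypothetical names, and a surface switch is only a stand-in for
whatever per-glyph state update a real driver would pay for.

```python
from typing import Callable, Dict, Hashable, Tuple


def pack_shelves(sizes: Dict[str, Tuple[int, int]],
                 atlas_width: int) -> Tuple[Dict[str, Tuple[int, int]], int]:
    """Place (width, height) rectangles left to right in rows ("shelves").

    Returns each glyph's (x, y) position in the atlas and the atlas
    height that was used.
    """
    placements: Dict[str, Tuple[int, int]] = {}
    x = y = shelf_height = 0
    for name, (w, h) in sizes.items():
        if w > atlas_width:
            raise ValueError(f"glyph {name!r} is wider than the atlas")
        if x + w > atlas_width:
            # Current shelf is full: start a new one below it.
            y += shelf_height
            x = 0
            shelf_height = 0
        placements[name] = (x, y)
        x += w
        shelf_height = max(shelf_height, h)
    return placements, y + shelf_height


def count_source_switches(text: str,
                          surface_of: Callable[[str], Hashable]) -> int:
    """Count how many times the source surface changes while drawing."""
    switches = 0
    current = object()  # sentinel that matches no real surface
    for ch in text:
        surface = surface_of(ch)
        if surface != current:
            switches += 1
            current = surface
    return switches


glyph_sizes = {"a": (4, 6), "b": (5, 8), "c": (4, 6)}
positions, used_height = pack_shelves(glyph_sizes, atlas_width=10)

# One pixmap per glyph: the surface is the glyph itself.
per_glyph = count_source_switches("abcabc", lambda ch: ch)
# One shared atlas: every glyph reads from the same surface.
shared = count_source_switches("abcabc", lambda ch: "atlas")
```

With one pixmap per glyph, every character in "abcabc" switches the
source; with a shared atlas, only the source rectangle changes.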

-- 
Eric Anholt                             anholt at FreeBSD.org
eric at anholt.net                         eric.anholt at intel.com
