LibXft : xftglyphcore woes

Charles Lindsey chl at clerew.man.ac.uk
Wed Nov 26 09:59:21 PST 2008


Summary:
--------
XftGlyphCore can waste a lot of time if asked to write glyphs outside of  
its drawable.
There is one major application (Opera/QT) that hits this problem in Spades.
I propose a patch to fix it.

XftGlyphCore:
-------------

These remarks relate to the version of xftcore.c identified by
  * $Id: xftcore.c,v 1.4 2005/07/03 07:00:57 daniels Exp $
My system is (via uname -a)
SunOS clerew 5.10 Generic_118822-25 sun4u sparc SUNW,Ultra-2

Essentially, XftGlyphCore is provided with a Drawable (via XftDraw), a  
bunch of glyphs, and a point (x,y) at which they are to be drawn. When  
used in anti-aliasing mode (when the XftDraw specifies TrueColor), it can  
consume a lot of resources (including at least one round-trip to the  
Xserver from a call of XGetImage). Even if the area where the glyphs are  
to be drawn does not intersect with any area within the drawable, it still  
goes through the whole process of drawing the glyphs, right through to  
obtaining the (supposed) existing background, calculating how the glyphs  
are to be merged with it, and writing the results to a (supposed) XImage  
which is never used.

In particular, the call of XGetImage fails, and it reverts to using its  
'use_pixmap' mode for the next few calls. This involves a call of  
XCreatePixmap to create a pixmap the size of the bunch of glyphs, a call  
of XCreateGC and of XCopyArea to populate it with the (supposed)  
background, and a further call of XGetImage to get that background in  
XImage form (that's three Xserver round-trips).

It then writes those glyphs to the XImage, and uses XPutImage to write  
them back to the Drawable (which then discovers there is no intersection,  
and so ignores them). No harm is done; nothing breaks; but if you do it  
often it consumes huge resources.

But why, you may ask, should anybody in his Right Mind call XftGlyphCore  
to write glyphs that are not even inside the Drawable? A Good Question,  
indeed, but sadly there is one major application that does it all the  
time :-( .

Opera/QT:
---------

The Opera Web Browser is written on top of the QT Toolkit, which in turn  
is written on top of LibXft. It includes a feature for reading and  
composing emails, and hence contains a text editor (also used when filling  
in Web Forms). I had long been aware that it had started to consume vast  
resources when composing large emails (or replying to large emails), and a  
long moan on opera.os.solaris had produced zilch response. So in  
desperation I set out to discover what was happening.

The first observation was that it only happened on one of the two screens  
on my machine (the one behind the fancy Creator Graphics card). After much  
poking around with truss and mdb, I discovered where the machine cycles  
were going to and, after downloading the source code of LibXft, I saw that  
the problem was related to the use of 24bit color plus TrueColor (my other  
screen uses 8bit color plus PseudoColor). Note that, up until that time, I  
had never even heard of LibXft, or of the Render extensions, or of  
anti-aliasing (thanks to Wikipedia for explaining that). Though I must  
confess that, to those of us whose accommodation is long gone and who have  
to sit at a very precise distance from the screen to see it all in focus,  
anti-aliasing does indeed give quite an improvement. So I had a very steep  
learning curve to follow :-( .

Anyway, I eventually pieced together what Opera plus QT was actually  
doing, so here it is (it is not a pretty story, and I have yet to discover  
whether it is an Opera problem or a QT problem).

Opera keeps a record of all the "word"s written to the editing window (a  
"word" is essentially a sequence of alphanumeric characters - any other  
character seems to be treated as a word of its own). Such words are used  
in calls of XftDrawString16, which duly calls XftGlyphCore. Each time you  
type a character (or use an arrow key, or delete a character) it discovers  
which bit of the window it needs to redraw, and constructs a brand new  
Pixmap of that size and prefills it with the supposed background of the  
window at that place (which, in practice, is always just pure white  
pixels). So now it needs to copy the required glyphs to that Pixmap  
(XftDrawString16), and when that is done it copies the Pixmap back to the  
original Window using XCopyArea, and then it throws the Pixmap away. A bit  
long-winded you might think, but You Ain't Seen Nothin' Yet.

For, to do this, it needs to know which glyphs are to be written into this  
(usually small) Pixmap. You might think that was a straightforward task,  
but No! It systematically goes through the WHOLE WINDOW, rewriting All the  
"words" known to be in it to that small Pixmap, whether they belong there  
or not. Most of them don't, of course! So, if your window is full of text,  
and you type some characters in at a reasonable typing speed, you can then  
sit and watch for several seconds while they all gradually appear (cursor  
movements and backspaces included) one-by-one. Not a pleasant way to  
construct your emails :-( .

But there is worse to come! Being an editing window, it naturally contains  
a cursor (this is the point-of-insertion cursor, not the mouse cursor).  
And this cursor blinks - 1/2 second on, 1/2 second off. Now it has the  
good sense not to use XDrawString16 to draw the cursor, BUT it does regard  
the cursor as part of the background, and so whatever glyph there might be  
at that point has to be re-anti-aliased. You can see what is coming ...

Twice every second, it has to redraw every "word" in the window, on the  
offchance that it overlaps the 2x15 Pixmap where the cursor is ........

OK, time for some numbers. The worst case is when the window contains  
"words" of 1 character each, so I wrote a window containing alternate 'x'  
and SP - that's 1700 'x's altogether, and observed the CPU load involved  
just to keep that cursor blinking.

Now my machine has two processors of 300MHz each (there are faster machines  
around, but that is still quite some computing power), and of those two  
processors
   XSun  was using 32.7%  - call it 65% of one processor
   Opera was using 26.0%  - call it 52% of the other processor
just to keep the cursor blinking. After applying the Patch which I shall  
describe, that reduced to
   XSun  was using 0.3%
   Opera was using 3.5%
and now I can compose my emails in peace again.

But what an incredibly Stupid way to program an application! Yes, I shall  
be moaning again to the Opera (or QT) people, but in the meantime I think  
LibXft needs to be made proof against such stupidities, because stupid  
applications are still going to happen.

xftcore.patch:
--------------

I have attached my Patch. It essentially does three things:

The macro XftIntMult is modified to optimize the case where the background  
is pure white or the glyph color is opaque. This was an early mod I made,  
and though possibly useful is not essential.
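
To illustrate the kind of shortcut meant here (a sketch only, not the code  
from the attached patch): the multiply below is the usual rounded a*b/255  
for 8-bit channels, which is what I believe XftIntMult computes. The name  
IntMultFast and the particular special cases chosen are assumptions made  
for this example.

```c
#include <stdint.h>

/* Rounded (a * b) / 255 for 8-bit channel values; t is a scratch variable.
 * Written from memory to mirror XftIntMult; treat the exact form as an
 * assumption. */
#define IntMult(a, b, t) \
    ((t) = (uint32_t)(a) * (b) + 0x80, ((((t) >> 8) + (t)) >> 8))

/* Hypothetical fast-path variant: skip the multiply whenever one operand
 * is 0x00 or 0xff, because the result is then known in advance. */
static uint8_t IntMultFast(uint8_t a, uint8_t b)
{
    uint32_t t;

    if (a == 0xff) return b;        /* fully opaque: result is b */
    if (b == 0xff) return a;        /* pure white channel: result is a */
    if (a == 0 || b == 0) return 0; /* anything times zero */
    return (uint8_t)IntMult(a, b, t);
}
```

The fast paths return exactly what the general formula would, so the only  
question is whether the extra comparisons pay for themselves.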

_XftSmoothGlyphGray8888 is modified so that it only draws the part of the  
glyph(s) that intersect with the XImage of the Drawable (which is always a  
Pixmap in the Opera case). Without this, there is now a danger of writing  
over unallocated storage.
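
A minimal sketch of that clipping, assuming a row-major 32-bit image and  
an 8-bit coverage mask for the glyph. Image32 and DrawMaskClipped are  
hypothetical names, and the real anti-aliased blend is replaced by a plain  
store to keep the example short; it is not the code from the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the destination XImage: 32-bit pixels,
 * row-major, no padding between rows. */
typedef struct {
    int width, height;
    uint32_t *pixels;
} Image32;

/* Write fg wherever the mask is non-zero, but only for the part of the
 * mask that lies inside the image. The mask's top-left corner is placed
 * at (x, y), which may be negative or beyond the image. Returns the
 * number of pixels written. */
static int DrawMaskClipped(Image32 *dst, const uint8_t *mask,
                           int mask_w, int mask_h, int x, int y, uint32_t fg)
{
    /* Clip the rectangle [x, x+mask_w) x [y, y+mask_h) to the image. */
    int x0 = x < 0 ? 0 : x;
    int y0 = y < 0 ? 0 : y;
    int x1 = x + mask_w > dst->width  ? dst->width  : x + mask_w;
    int y1 = y + mask_h > dst->height ? dst->height : y + mask_h;
    int written = 0;

    /* If the intersection is empty, neither loop body ever runs. */
    for (int row = y0; row < y1; row++) {
        const uint8_t *src = mask + (row - y) * mask_w; /* matching mask row */
        uint32_t *out = dst->pixels + row * dst->width;

        for (int col = x0; col < x1; col++) {
            if (src[col - x]) {
                out[col] = fg;
                written++;
            }
        }
    }
    return written;
}
```

The point is that every index into the image is derived from the clipped  
bounds, so a glyph hanging off any edge can never touch storage outside it.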

XftGlyphCore now uses XGetGeometry to discover the size of the Drawable  
(caching it in a static variable to save Xserver round-trips). Then it  
determines the intersection with the glyphs to be drawn, bailing out if  
the intersection is empty. Finally, it draws whatever portion of the  
glyphs lies within the intersection. It also, for good measure, checks the  
intersection and bails out in the same way when sharp glyphs are used.
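
The bail-out test amounts to a rectangle intersection. Here is a sketch,  
assuming the Drawable's width and height have already been obtained (for  
instance from XGetGeometry) and that the extents of the glyphs are known.  
Rect and RectIntersect are hypothetical names, not taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical rectangle type: origin plus size, in drawable coordinates. */
typedef struct {
    int x, y, width, height;
} Rect;

/* Store the intersection of a and b in *out and return true, or return
 * false (leaving *out untouched) when they do not overlap at all. */
static bool RectIntersect(const Rect *a, const Rect *b, Rect *out)
{
    int x0 = a->x > b->x ? a->x : b->x;
    int y0 = a->y > b->y ? a->y : b->y;
    int ax1 = a->x + a->width,  bx1 = b->x + b->width;
    int ay1 = a->y + a->height, by1 = b->y + b->height;
    int x1 = ax1 < bx1 ? ax1 : bx1;
    int y1 = ay1 < by1 ? ay1 : by1;

    if (x1 <= x0 || y1 <= y0)
        return false;   /* empty: nothing to draw, so the caller can bail out */

    out->x = x0;
    out->y = y0;
    out->width = x1 - x0;
    out->height = y1 - y0;
    return true;
}
```

With the Drawable as {0, 0, width, height} and the glyph extents as the  
second rectangle, a false result means the XGetImage and everything after  
it can be skipped entirely.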

Of course, this all causes some extra overhead in cases where all the  
glyphs do lie within the Drawable, but not too much of it AFAICS.

Note that, if this patch gets adopted, it will probably be necessary to  
apply similar treatment to XftGlyphSpecCore and to the other  
_XftSmoothGlyph*, and I would be happy to work on that if needed (though I  
am not sure I could test them). But what I have done so far is sufficient  
for my present need, and for proof of concept.

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131                          Web: http://www.cs.man.ac.uk/~chl
Email: chl at clerew.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5
-------------- next part --------------
A non-text attachment was scrubbed...
Name: xftcore.patch
Type: application/octet-stream
Size: 8008 bytes
Desc: not available
URL: <http://lists.x.org/archives/xorg/attachments/20081126/d8039d57/attachment.obj>

