problem with exaBufferGlyph()

Michel Dänzer michel at daenzer.net
Sat Jan 7 07:44:51 UTC 2017


On 23/12/16 03:39 PM, Michael wrote:
> Hello,
> 
> first some context - I've been writing EXA support for Permedia 2 and
> 3, mostly because these cards are still kinda useful on sparc and alpha
> hardware. For pm2 there's actually documentation, and the chip can be
> used to accelerate at least some xrender operations.
> The problem: this chip can't deal with A8 masks for rendering glyphs.
> It's perfectly happy to render ARGB though, and that's where current
> EXA falls short.
> As it is right now, exaGlyphs() will call CheckComposite() with an A8
> Picture as destination to see if the driver supports that, and fall
> back to ARGB if it doesn't. That's fine, although it may be better to
> do that test once on startup instead of every time a glyph needs to be
> drawn.
> The problem is that exaBufferGlyph() always caches glyphs in the
> format returned by GetGlyphPicture(), not the format of the
> destination Picture handed to it. For drivers that can't support A8,
> this renders the cache unusable to the accelerator, so glyphs are
> constantly copied back and forth between video and main memory,
> which kills performance to the point that software rendering is
> faster.
> 
> So, what I'm proposing is something like this:
> diff -u -r1.2 exa_glyphs.c
> --- exa_glyphs.c        22 Dec 2016 21:31:08 -0000      1.2
> +++ exa_glyphs.c        23 Dec 2016 05:42:08 -0000
> @@ -544,7 +544,20 @@
>                 INT16 ySrc, INT16 xMask, INT16 yMask, INT16 xDst, INT16 yDst)
>  {
>      ExaScreenPriv(pScreen);
> +    /*
> +     * XXX
> +     * Request the new glyph in the format we need to draw in, not whatever
> +     * GetGlyphPicture() hands us, which will (almost?) always be A8.
> +     * That way drivers that can't handle A8 but can do Xrender ops in ARGB
> +     * will be able to do hardware rendering in and out of the glyph cache.
> +     * This results in a major performance boost on such hardware.
> +     * Drivers that can handle A8 shouldn't see any difference.
> +     */
> +#if 1
> +    unsigned int format = pDst->format;
> +#else
>      unsigned int format = (GetGlyphPicture(pGlyph, pScreen))->format;
> +#endif
>      int width = pGlyph->info.width;
>      int height = pGlyph->info.height;
>      ExaCompositeRectPtr rect;
> 
> Without this, I get about 9000/sec with x11perf -aaftext on an Ultra 60
> - software rendering yields 15000/s. With this it's 75000/s. Not
> earth-shatteringly fast but still more than what I expected from such
> an old chip that wasn't exactly known for its speed even back in its
> day.
> 
> Any thoughts? Am I missing something?

Your change makes sense to me. Please submit a patch which just changes
the format assignment; there's no need for the comment and preprocessor
guards.
Maybe it can also remove this code, since I don't think composite
operations to 1bpp destinations can ever be accelerated:

    if (PICT_FORMAT_BPP(format) == 1)
        format = PICT_a8;

If so, the local variable format could be eliminated in favour of using
pDst->format directly.
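
For reference, with the format assignment changed and the 1bpp check
dropped, the start of exaBufferGlyph() might look roughly like this
(just a sketch based on the hunk quoted above; the surrounding lines
may differ in the actual tree):

    ExaScreenPriv(pScreen);
    /* Cache the glyph in the destination's format, so drivers that
     * can't composite from A8 but can from ARGB keep hitting the
     * cache in video memory. */
    unsigned int format = pDst->format;
    int width = pGlyph->info.width;
    int height = pGlyph->info.height;
    ExaCompositeRectPtr rect;

And if the local variable goes away too, the remaining uses of format
in the function would simply become pDst->format.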


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

