Glyph rendering
Huang, FrankR
FrankR.Huang at amd.com
Sun Jul 25 23:27:36 PDT 2010
Based on debugging and experiments, the Geode graphics processor (GP) will be used to render the middle dwords of the PictOpAdd operation. Because the Geode HW has no writemask, this method can only be used where the destination offset is a multiple of 4 (offset % 4 == 0).
With the improved code, if the destination start offset is not a multiple of 4, the driver first does the SW "*dest = *dest + *src" calculation on the leading pixels. Likewise, the driver does the SW rendering for the remaining pixels at the end of each line. That is to say, we use the Geode GP for as much of the rendering as possible. A picture to demonstrate:
% 4:        1 2 3 0 1 2 3 0 1 2 3 0 1 2
destOffset: x x x x x x x x x x x x x x
HW:               x x x x(x x x x)
SW:         x x x                 x x x
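The split in the picture can be sketched in C as follows. This is only an illustration under stated assumptions, not code from the Geode driver: the names `span_split` and `split_span` are hypothetical, and each "x" is assumed to be one byte, so a dword is 4 of them.

```c
/* Hypothetical helper, not taken from the Geode driver.
 * Splits one line of `width` bytes that starts at byte offset `dest`
 * into a leading SW part, a dword-aligned HW part and a trailing SW part. */
struct span_split {
    unsigned lead_sw;  /* bytes before the first 4-byte boundary */
    unsigned hw;       /* bytes covered by whole aligned dwords */
    unsigned tail_sw;  /* bytes left after the last whole dword */
};

static struct span_split split_span(unsigned dest, unsigned width)
{
    struct span_split s;
    unsigned lead = (4 - dest % 4) % 4;  /* distance to the next boundary */

    if (lead > width)
        lead = width;                    /* line ends before the boundary */

    s.lead_sw = lead;
    s.hw = (width - lead) / 4 * 4;       /* round down to whole dwords */
    s.tail_sw = width - lead - s.hw;
    return s;
}
```

For the line in the picture (start offset with % 4 == 1, 14 bytes wide), `split_span(1, 14)` gives 3 leading SW bytes, 8 HW bytes and 3 trailing SW bytes.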
But from my debugging, the last dword (the content in brackets) is rendered incorrectly, so I only render (dest%4 -1) dwords in HW, that is, every aligned dword except the last one. The next picture shows what is actually rendered:
% 4:        1 2 3 0 1 2 3 0 1 2 3 0 1 2
destOffset: x x x x x x x x x x x x x x
HW:               x x x x
SW:         x x x         x x x x x x x
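A minimal C sketch of this adjusted scheme, again under assumptions rather than taken from the driver: `hw_bytes_skip_last`, `sw_add_span` and `add_sat_u8` are hypothetical names, each "x" is assumed to be one byte, and the SW "*dest = *dest + *src" step is assumed to saturate at 0xff, as PictOpAdd does in the Render extension.

```c
#include <stddef.h>
#include <stdint.h>

/* Saturating 8-bit add: one "*dest = *dest + *src" step
 * (assumption: clamped at 0xff). */
static uint8_t add_sat_u8(uint8_t d, uint8_t s)
{
    unsigned sum = (unsigned)d + s;
    return sum > 0xff ? 0xff : (uint8_t)sum;
}

/* Hypothetical SW path: add `n` source bytes into the destination. */
static void sw_add_span(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = add_sat_u8(dst[i], src[i]);
}

/* Bytes handed to the HW when the last aligned dword is left to SW.
 * `dest` is the start byte offset of the line, `width` its length. */
static unsigned hw_bytes_skip_last(unsigned dest, unsigned width)
{
    unsigned lead = (4 - dest % 4) % 4;  /* leading unaligned bytes */
    unsigned dwords;

    if (lead >= width)
        return 0;                        /* no aligned dword at all */

    dwords = (width - lead) / 4;         /* whole aligned dwords */
    return dwords > 1 ? (dwords - 1) * 4 : 0;
}
```

For the line in the picture, `hw_bytes_skip_last(1, 14)` is 4, which leaves 10 bytes for `sw_add_span`. A line with only one aligned dword gets no HW rendering at all under this rule.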
That is to say, the wider the glyph, the larger the share of each line that the HW handles and the better our rendering performance.
In my tests:
"x11perf -aa10text" : 46400/s
"x11perf -aa24text" : 18300/s
Both are 10 times faster than before.
Thanks,
Frank
> -----Original Message-----
> From: Michel Dänzer [mailto:michel at daenzer.net]
> Sent: July 21, 2010 16:24
> To: Huang, FrankR
> Cc: Mart Raudsepp; Torres, Rigo; Writer, Tim; xorg-devel at lists.x.org;
> xorg-driver-geode at lists.x.org; Cui, Hunk; Deucher, Alexander
> Subject: RE: Glyph rendering
>
>
> [ Fixed your quoting, please consider using a better e-mail client ]
>
> On Wed, 2010-07-21 at 16:11 +0800, Huang, FrankR wrote:
> >
> > From: Michel Dänzer [mailto:michel at daenzer.net]
> >
> > > On Wed, 2010-07-21 at 15:30 +0800, Huang, FrankR wrote:
> > > >
> > > > But as you know, for the PICT_a8r8g8b8 method, the width and height
> > > > of the source sometimes cannot be divided by 4 (such as 5...), so the
> > > > remaining pixel PictOpAdd should be done by SW code.
> > >
> > > The height doesn't matter, and if there's a writemask it should be
> > > possible to use that to mask out source/destination pixels that don't
> > > align to an ARGB pixel.
> >
> > Yes. My description was not accurate, we only care about the width for
> > this condition. Do you mean the writemask implemented in HW? From what I
> > found, it is not in the Geode GP. :(
>
> Yes, that's what I meant.
>
>
> > > > For the mixed way (HW+SW as I described above), the speed can be
> > > > 50000/s, unfortunately the result still is not correct (seems correct
> > > > by debugging, I'm still checking it).
> > >
> > > Sounds like maybe you're not properly synchronizing between GPU and
> > > CPU access.
> >
> > Michael,
>
> Ahem.
>
> > maybe you misunderstood. By "SW" I mean that our driver still uses
> > a formula to do the "+" operation in video memory instead of falling back
> > to server handling (maybe this is what you meant). We don't fall back anymore.
>
> I figured that, which means the driver is responsible for making sure
> the GPU and CPU properly see each other's rendering results.
>
>
> --
> Earthling Michel Dänzer | http://www.vmware.com
> Libre software enthusiast | Debian, X and DRI developer