[PATCH] EXA: Move floating point math to the GPU as much as possible for R1-5xx.
Michel Dänzer
michel at daenzer.net
Mon Oct 5 02:42:24 PDT 2009
On Mon, 2009-10-05 at 01:38 -0400, Alex Deucher wrote:
> 2009/10/3 Michel Dänzer <michel at daenzer.net>:
> > From: Michel Dänzer <daenzer at vmware.com>
> >
> > Also add fast paths for untransformed Composite operations.
> >
> > This can significantly reduce the CPU overhead in RadeonCompositeTileCP, at
> > least for TCL capable GPUs.
> > ---
> >
> > I think the basic idea is sound, but I'm not sure if some parts are going too
> > far, e.g. the float fw, fh locals in the fastpath. Opinions?
>
>
> Looks pretty good. What sort of improvements are you seeing?
Not sure I've measured this one separately, but together with the
changes I pushed recently I've seen an x11perf -aa10text speedup on the
order of 10-20%, both with and without KMS.
> Are there any improvements to the non-tcl path?
Hmm, probably not as is, but it might be possible to use the fast path
there as well, at least in the untransformed case.
> If you wanted to take this a step further you could add some instructions
> to take the reciprocal in the shader.
Right, but I wouldn't expect that to make any significant difference;
the setup overhead seems small compared to RadeonCompositeTileCP. Also
I'm not planning to mess with shaders in such a low-level form, but feel
free. :)
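
For readers following the thread, here is a minimal sketch of the general
idea under discussion, not code from the patch. It assumes the floating
point work in question is the scaling of pixel coordinates into normalized
texture coordinates. All names (vec2, tex_scale, normalize_div,
normalize_mul) are hypothetical, and normalize_mul is only a CPU stand-in
for the multiply that the TCL unit or a vertex shader would perform.

```c
#include <math.h>

/* Hypothetical types and names; nothing here is taken from the driver. */
typedef struct { float x, y; } vec2;
typedef struct { float inv_w, inv_h; } tex_scale;

/* CPU-heavy variant: two float divisions for every vertex emitted. */
static vec2 normalize_div(float px, float py, int tex_w, int tex_h)
{
    vec2 r = { px / (float)tex_w, py / (float)tex_h };
    return r;
}

/* Setup step: compute the reciprocals once per texture, not per vertex.
 * This is the part that could in principle also move into the shader. */
static tex_scale tex_scale_init(int tex_w, int tex_h)
{
    tex_scale s = { 1.0f / (float)tex_w, 1.0f / (float)tex_h };
    return s;
}

/* Stand-in for the GPU side: the CPU would emit raw pixel coordinates
 * and the hardware would apply this multiply to each vertex. */
static vec2 normalize_mul(float px, float py, tex_scale s)
{
    vec2 r = { px * s.inv_w, py * s.inv_h };
    return r;
}
```

With this split, the per-vertex CPU work reduces to writing out the raw
coordinates, which is where any saving would come from. Whether the
reciprocal itself is computed on the CPU or in the shader only affects the
once-per-operation setup cost.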
> Also, we don't yet take advantage of the tcl hw on r1xx and r2xx chips.
Yeah, that might be a worthwhile project for those with such hardware.
--
Earthling Michel Dänzer | http://www.vmware.com
Libre software enthusiast | Debian, X and DRI developer