render improvements

Fri Apr 15 16:43:27 PDT 2005

> The speedup using a recent x.org server is smaller; we currently suspect that 
> framebuffer reads are the limiting factor now. We're not sure why they are so 
> slow, but I hope someone on the list can enlighten us. My current suspicion is that 
> they are not cached in the CPU at all and always go directly to graphics 
> memory.
> 
> We can try to reduce framebuffer reads to speed this up (I have some ideas how 
> to do this), but we wonder if there are ways to speed up the reads from the 
> framebuffer, at least from offscreen memory?

Yes, the framebuffer is usually mapped uncacheable. On some architectures
it can have some kind of prefetch, but that is not always the case.
That's also why it's usually better to use the largest possible transfer
size from/to the framebuffer, to generate long bursts. For example, using
AltiVec/VMX on ppc would allow 16-byte bursts. I suppose MMX/SSE would
allow something similar, in addition to the actual parallelisation of
pixel processing, of course.
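
To make the "largest possible transfer size" point concrete, here is a
rough sketch in portable C, not code from the X server. The helper name
fb_read_wide and the BURST_BYTES constant are made up for illustration.
It assumes the compiler lowers each fixed-size 16-byte memcpy to a single
vector load/store, which is common but not guaranteed; a real
implementation would more likely use AltiVec or SSE loads directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 16 bytes: the width of one AltiVec/VMX or SSE register. */
#define BURST_BYTES 16

/*
 * Hypothetical helper: copy `len` bytes from a framebuffer mapping into
 * ordinary (cached) memory, moving BURST_BYTES at a time so that the
 * uncached source is read in a few wide transfers instead of many
 * narrow ones.
 */
static void fb_read_wide(void *dst, const void *fb_src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = fb_src;

    assert(dst != NULL && fb_src != NULL);

    /* Bulk part: constant-size copies that the compiler can turn into
     * one wide load and one wide store each (assumption, see above). */
    while (len >= BURST_BYTES) {
        memcpy(d, s, BURST_BYTES);
        d += BURST_BYTES;
        s += BURST_BYTES;
        len -= BURST_BYTES;
    }

    /* Tail: fewer than BURST_BYTES bytes left, copy them one by one. */
    for (; len > 0; len--)
        *d++ = *s++;
}
```

The idea would be to pull a scanline into a cached scratch buffer this
way, do the pixel processing there, and write the result back in the
same wide chunks.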

Ben.